(54) Title: POLYNUCLEOTIDE VACCINES EXPRESSING CODON OPTIMIZED HIV-1 POL AND MODIFIED HIV-1 POL

(57) Abstract: Pharmaceutical compositions which comprise HIV Pol DNA vaccines are disclosed, along with the production and
use of these DNA vaccines. The pol-based DNA vaccines of the invention are directly introduced into living vertebrate
tissue, preferably human tissue, and preferably express inactivated versions of the HIV Pol protein devoid of protease,
reverse transcriptase, RNase H and integrase activity, inducing a cellular immune response which specifically recognizes
human immunodeficiency virus-1 (HIV-1). The DNA molecules which comprise the open reading frame of these DNA vaccines are
synthetic DNA molecules encoding codon optimized HIV-1 Pol and codon optimized inactive derivatives of optimized HIV-1 Pol,
including DNA molecules which encode inactive Pol proteins which comprise an amino terminal leader peptide.
TITLE OF THE INVENTION

POLYNUCLEOTIDE VACCINES EXPRESSING CODON OPTIMIZED HIV-1
POL AND MODIFIED HIV-1 POL

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit, under 35 U.S.C. § 119(e), of U.S. 
provisional application 60/171,542, filed December 22, 1999. 

STATEMENT REGARDING FEDERALLY-SPONSORED R&D 
Not Applicable 

REFERENCE TO MICROFICHE APPENDIX
Not Applicable 

FIELD OF THE INVENTION 

The present invention relates to HIV Pol polynucleotide pharmaceutical
products, as well as the production and use thereof, which, when directly introduced
into living vertebrate tissue, preferably a mammalian host such as a human or a
non-human mammal of commercial or domestic veterinary importance, express the
HIV Pol protein or biologically relevant portions thereof within the animal, inducing
a cellular immune response which specifically recognizes human immunodeficiency
virus-1 (HIV-1). The polynucleotides of the present invention are synthetic DNA
molecules encoding codon optimized HIV-1 Pol and derivatives of optimized HIV-1
Pol, including constructs wherein protease, reverse transcriptase, RNase H and
integrase activity of HIV-1 Pol is inactivated. The polynucleotide vaccines of the
present invention should offer a prophylactic advantage to previously uninfected
individuals and/or provide a therapeutic effect by reducing viral load levels within an
infected individual, thus prolonging the asymptomatic phase of HIV-1 infection.



WO 01/45748 PCT/US00/34724



BACKGROUND OF THE INVENTION 

Human Immunodeficiency Virus-1 (HIV-1) is the etiological agent of
acquired human immune deficiency syndrome (AIDS) and related disorders. HIV-1
is an RNA virus of the Retroviridae family and exhibits the 5' LTR-gag-pol-env-LTR 3'
organization of all retroviruses. The integrated form of HIV-1, known as the
provirus, is approximately 9.8 kb in length. Each end of the viral genome contains
flanking sequences known as long terminal repeats (LTRs). The HIV genes encode at
least nine proteins and are divided into three classes: the major structural proteins
(Gag, Pol, and Env), the regulatory proteins (Tat and Rev), and the accessory proteins
(Vpu, Vpr, Vif and Nef).

The gag gene encodes a 55-kilodalton (kDa) precursor protein (p55) which is
expressed from the unspliced viral mRNA and is proteolytically processed by the HIV
protease, a product of the pol gene. The mature p55 protein products are p17
(matrix), p24 (capsid), p9 (nucleocapsid) and p6.

The pol gene encodes proteins necessary for virus replication: a reverse
transcriptase, a protease, integrase and RNase H. These viral proteins are expressed
as a Gag-Pol fusion protein, a 160 kDa precursor protein which is generated via
ribosomal frameshifting. The virally encoded protease proteolytically cleaves the Pol
polypeptide away from the Gag-Pol fusion and further cleaves the Pol polypeptide to
the mature proteins which provide protease (Pro, P10), reverse transcriptase (RT,
P50), integrase (IN, p31) and RNase H (RNase, p15) activities.

The nef gene encodes an early accessory HIV protein (Nef) which has been
shown to possess several activities such as down regulating CD4 expression,
disturbing T-cell activation and stimulating HIV infectivity.

The env gene encodes the viral envelope glycoprotein that is translated as a
160-kilodalton (kDa) precursor (gp160) and then cleaved by a cellular protease to
yield the external 120-kDa envelope glycoprotein (gp120) and the transmembrane
41-kDa envelope glycoprotein (gp41). Gp120 and gp41 remain associated and are
displayed on the viral particles and the surface of HIV-infected cells.

The tat gene encodes a long form and a short form of the Tat protein, an RNA
binding protein which is a transcriptional transactivator essential for HIV-1
replication.

The rev gene encodes the 13 kDa Rev protein, an RNA binding protein. The
Rev protein binds to a region of the viral RNA termed the Rev response element
(RRE). The Rev protein promotes transfer of unspliced viral RNA from the
nucleus to the cytoplasm. The Rev protein is required for HIV late gene expression
and, in turn, HIV replication.

Gp120 binds to the CD4/chemokine receptor present on the surface of helper
T-lymphocytes, macrophages and other target cells in addition to other co-receptor
molecules. X4 (T-cell line tropic) viruses show tropism for CD4/CXCR4 complexes,
while R5 (macrophage tropic) viruses interact with a CD4/CCR5 receptor complex.
After gp120 binds to CD4, gp41 mediates the fusion event responsible for virus entry.
The virus fuses with and enters the target cell, followed by reverse transcription of its
single stranded RNA genome into double-stranded DNA via an RNA dependent
DNA polymerase. The viral DNA, known as provirus, enters the cell nucleus, where
the viral DNA directs the production of new viral RNA within the nucleus, expression
of early and late HIV viral proteins, and subsequently the production and cellular
release of new virus particles. Recent advances in the ability to detect viral load
within the host show that the primary infection results in an extremely high
generation and tissue distribution of the virus, followed by a steady state level of virus
(albeit through a continual viral production and turnover during this phase), leading
ultimately to another burst of virus load which leads to the onset of clinical AIDS.
Productively infected cells have a half life of several days, whereas chronically or
latently infected cells have a 3-week half life, followed by non-productively infected
cells which have a long half life (over 100 days) but do not significantly contribute to
day to day viral loads seen throughout the course of disease.
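
The half-lives quoted above can be turned into surviving fractions with simple exponential decay. The following is a minimal sketch, assuming first-order decay; the function name is hypothetical, and the 2-day value is an assumed stand-in for "several days" (the 21-day and 100-day values come from the text).

```python
def fraction_remaining(half_life_days: float, elapsed_days: float) -> float:
    """Fraction of a cell population left after elapsed_days,
    assuming first-order (exponential) decay with the given half-life."""
    if half_life_days <= 0:
        raise ValueError("half_life_days must be positive")
    return 0.5 ** (elapsed_days / half_life_days)


# Illustrative half-lives in days. Only the last two come from the text;
# 2.0 is an assumption standing in for "several days".
HALF_LIVES = {
    "productively infected": 2.0,
    "chronically or latently infected": 21.0,
    "non-productively infected": 100.0,  # the text says "over 100 days"
}

if __name__ == "__main__":
    for label, t_half in HALF_LIVES.items():
        left = fraction_remaining(t_half, 21.0)
        print(f"{label}: {left:.3f} of cells remain after 21 days")
```

Under these assumptions the short-lived population is almost entirely replaced within three weeks, which is consistent with the statement that a steady viral load reflects continual production and turnover.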

Destruction of CD4 helper T lymphocytes, which are critical to immune
defense, is a major cause of the progressive immune dysfunction that is the hallmark
of HIV infection. The loss of CD4 T-cells seriously impairs the body's ability to fight
most invaders, but it has a particularly severe impact on the defenses against viruses,
fungi, parasites and certain bacteria, including mycobacteria.

Effective treatment regimens for HIV-1 infected individuals have become
available recently. However, these drugs will not have a significant impact on the
disease in many parts of the world and they will have a minimal impact in halting the
spread of infection within the human population. As is true of many other infectious
diseases, a significant epidemiologic impact on the spread of HIV-1 infection will
only occur subsequent to the development and introduction of an effective vaccine.
There are a number of factors that have contributed to the lack of successful vaccine
development to date. As noted above, it is now apparent that in a chronically infected
person there exists constant virus production in spite of the presence of anti-HIV-1
humoral and cellular immune responses and destruction of virally infected cells. As
in the case of other infectious diseases, the outcome of disease is the result of a
balance between the kinetics and the magnitude of the immune response and the
pathogen replicative rate and accessibility to the immune response. Pre-existing
immunity may be more successful with an acute infection than an evolving immune
response can be with an established infection. A second factor is the considerable
genetic variability of the virus. Although anti-HIV-1 antibodies exist that can
neutralize HIV-1 infectivity in cell culture, these antibodies are generally virus
isolate-specific in their activity. It has proven impossible to define serological
groupings of HIV-1 using traditional methods. Rather, the virus seems to define a
serological "continuum" so that individual neutralizing antibody responses, at best,
are effective against only a handful of viral variants. Given this latter observation, it
would be useful to identify immunogens and related delivery technologies that are
likely to elicit anti-HIV-1 cellular immune responses. It is known that in order to
generate CTL responses antigen must be synthesized within or introduced into cells,
subsequently processed into small peptides by the proteasome complex, and
translocated into the endoplasmic reticulum/Golgi complex secretory pathway for
eventual association with major histocompatibility complex (MHC) class I proteins.
CD8+ T lymphocytes recognize antigen in association with class I MHC via the T cell
receptor (TCR) and the CD8 cell surface protein. Activation of naive CD8+ T cells
into activated effector or memory cells generally requires both TCR engagement of
antigen as described above as well as engagement of costimulatory proteins. Optimal
induction of CTL responses usually requires "help" in the form of cytokines from
CD4+ T lymphocytes which recognize antigen associated with MHC class II
molecules via TCR and CD4 engagement.

Larder, et al. (1987, Nature 327: 716-717) and Larder, et al. (1989, Proc.
Natl. Acad. Sci. 86: 4803-4807) disclose site specific mutagenesis of HIV-1 RT and
the effect such changes have on in vitro activity and infectivity related to interaction
with known inhibitors of RT.

Davies, et al. (1991, Science 252: 88-95) disclose the crystal structure of the
RNase H domain of HIV-1 Pol.

Schatz, et al. (1989, FEBS Lett. 257: 311-314) disclose that mutations
Glu478Gln and His539Phe in a complete HIV-1 RT/RNase H DNA fragment result
in defective RNase activity without affecting RT activity.

Mizrahi, et al. (1990, Nucl. Acids Res. 18: 5359-5363) disclose additional
mutations Asp443Asn and Asp498Asn in the RNase region of the pol gene which also
result in defective RNase activity. The authors note that the Asp498Asn mutant was
difficult to characterize due to instability of this mutant protein.

Leavitt, et al. (1993, J. Biol. Chem. 268: 2113-2119) disclose several
mutations, including an Asp64Val mutation, which show differing effects on HIV-1
integrase (IN) activity.

Wiskerchen, et al. (1995, J. Virol. 69: 376-386) disclose single and double
mutants, including mutation of aspartic acid residues, which affect HIV-1 IN and viral
replication functions.
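
The references above name substitutions in the form "wild-type residue, position, new residue" (for example Asp443Asn). A minimal sketch of how such a label could be parsed and applied to a protein sequence is given here. The function names and the toy sequence in the usage note are hypothetical, and 1-based residue numbering is assumed.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

_LABEL = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'Asp443Asn' into ('D', 443, 'N')."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    wild_type, position, replacement = match.groups()
    try:
        return THREE_TO_ONE[wild_type], int(position), THREE_TO_ONE[replacement]
    except KeyError as exc:
        raise ValueError(f"unknown residue code in {label!r}") from exc


def apply_mutation(protein: str, label: str) -> str:
    """Return protein with the substitution applied (positions are 1-based).

    Raises ValueError if the residue at that position is not the stated wild type.
    """
    wild_type, position, replacement = parse_mutation(label)
    if not 1 <= position <= len(protein):
        raise ValueError(f"position {position} is outside the sequence")
    if protein[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {protein[position - 1]}"
        )
    return protein[: position - 1] + replacement + protein[position:]
```

For instance, `apply_mutation("MKDL", "Asp3Asn")` on the toy peptide `MKDL` gives `MKNL`. Checking the stated wild-type residue before substituting guards against numbering mismatches between a label and the sequence it is applied to.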

It would be of great import in the battle against AIDS to produce a
prophylactic- and/or therapeutic-based HIV vaccine which generates a strong cellular
immune response against an HIV infection. The present invention addresses and
meets this need by disclosing a class of DNA vaccines based on host delivery and
expression of modified versions of the HIV-1 gene, pol.

SUMMARY OF THE INVENTION

The present invention relates to synthetic DNA molecules (also referred to
herein as "polynucleotides") and associated DNA vaccines (also referred to herein as
"polynucleotide vaccines") which elicit cellular immune and humoral responses upon
administration to the host, including primates and especially humans, and also
including a non-human mammal of commercial or domestic veterinary importance.
An effect of the cellular immune-directed vaccines of the present invention should be
a lower transmission rate to previously uninfected individuals and/or a reduction in
the levels of the viral loads within an infected individual, so as to prolong the
asymptomatic phase of HIV-1 infection. In particular, the present invention relates to
DNA vaccines which encode various forms of HIV-1 Pol, wherein administration,
intracellular delivery and expression of the HIV-1 Pol gene of interest elicits a host
CTL and Th response. The preferred synthetic DNA molecules of the present
invention encode codon optimized versions of wild type HIV-1 Pol, codon optimized
versions of HIV-1 Pol fusion proteins, and codon optimized versions of HIV-1 Pol
proteins and fusion proteins, including but not limited to pol modifications involving
residues within the catalytic regions responsible for RT, RNase and IN activity within
the host cell.

A particular embodiment of the present invention relates to codon optimized
wt-pol DNA constructs wherein DNA sequences encoding the protease (PR) activity
are deleted, leaving codon optimized "wild type" sequences which encode RT
(reverse transcriptase and RNase H activity) and IN (integrase) activity. The nucleotide
sequence of a DNA molecule which encodes this protein is disclosed herein as SEQ
ID NO:1 and the corresponding amino acid sequence of the expressed protein is
disclosed herein as SEQ ID NO:2.

The present invention preferably relates to an HIV-1 DNA pol construct which
is devoid of DNA sequences encoding any PR activity, as well as containing a
mutation(s) which at least partially, and preferably substantially, abolishes RT, RNase
and/or IN activity. One type of HIV-1 pol mutant may include but is not limited to a
mutated DNA molecule comprising at least one nucleotide substitution which results
in a point mutation which effectively alters an active site within the RT, RNase and/or
IN regions of the expressed protein, resulting in at least substantially decreased
enzymatic activity for the RT, RNase H and/or IN functions of HIV-1 Pol. In a
preferred embodiment of this portion of the invention, an HIV-1 DNA pol construct
contains a mutation or mutations within the Pol coding region which effectively
abolish RT, RNase H and IN activity. An especially preferable HIV-1 DNA pol
construct is a DNA molecule which contains at least one point mutation which alters
the active site of the RT, RNase H and IN domains of Pol, such that each activity is at
least substantially abolished. Such an HIV-1 Pol mutant will most likely comprise at
least one point mutation in or around each catalytic domain responsible for RT,
RNase H and IN activity, respectively. To this end, an especially preferred HIV-1
DNA pol construct is exemplified herein and contains nine codon substitution
mutations which result in an inactivated Pol protein (IA Pol: SEQ ID NO:4, Figure
2A-C) which has no PR, RT, RNase or IN activity, wherein three such point
mutations reside within each of the RT, RNase and IN catalytic domains. Any
combination of the mutations disclosed herein may be suitable and therefore may be
utilized as an IA-Pol-based vaccine of the present invention. While addition and
deletion mutations are contemplated and within the scope of the invention, the
preferred mutation is a point mutation resulting in a substitution of the wild type
amino acid with an alternative amino acid residue.

Another aspect of the present invention is to generate HIV-1 Pol-based
vaccine constructions which comprise a eukaryotic trafficking signal peptide such as
the leader peptide from human tPA. To this end, the present invention relates to a
DNA molecule which encodes a codon optimized wt-pol DNA construct wherein the
protease (PR) activity is deleted and a human tPA leader sequence is fused to the 5'
end of the coding region. A DNA molecule which encodes this protein is disclosed
herein as SEQ ID NO:5, the open reading frame disclosed herein as SEQ ID NO:6.

The present invention especially relates to an HIV-1 Pol mutant such as IA-Pol
(SEQ ID NO:4) which comprises a leader peptide, such as the human tPA leader, at the
amino terminal portion of the protein, which may affect cellular trafficking and hence,
immunogenicity of the expressed protein within the host cell. Any such HIV-1 DNA pol
mutant disclosed in the above paragraphs is suitable for fusion downstream of a leader
peptide, including but by no means limited to the human tPA leader sequence. Therefore,
any such leader peptide-based HIV-1 pol mutant construct may include but is not limited
to a mutated DNA molecule which effectively alters the catalytic activity of the RT,
RNase and/or IN region of the expressed protein, resulting in at least substantially
decreased enzymatic activity for one or more of the RT, RNase H and/or IN functions of
HIV-1 Pol. In a preferred embodiment of this portion of the invention, a leader
peptide/HIV-1 DNA pol construct contains a mutation or mutations within the Pol coding
region which effectively abolish RT, RNase H and IN activity. An especially
preferable HIV-1 DNA pol construct is a DNA molecule which contains at least one point
mutation which alters the active site and catalytic activity within the RT, RNase H and IN
domains of Pol, such that each activity is at least substantially abolished, and preferably
totally abolished. Such an HIV-1 Pol mutant will most likely comprise at least one point
mutation in or around each catalytic domain responsible for RT, RNase H and IN activity,
respectively. An especially preferred embodiment of this portion of the invention relates
to a human tPA leader fused to the IA-Pol protein comprising the nine mutations shown
in Table 1. The DNA molecule is disclosed herein as SEQ ID NO:7 and the expressed
tPA-IA Pol protein comprises a fusion junction as shown in Figure 3. The complete
amino acid sequence of the expressed protein is set forth in SEQ ID NO:8.

The present invention also relates to a substantially purified protein expressed
from the DNA polynucleotide vaccines of the present invention, especially the purified
proteins set forth below as SEQ ID NOs: 2, 4, 6, and 8. These purified proteins may be
useful as protein-based HIV vaccines.

The present invention also relates to non-codon optimized versions of DNA
molecules and associated polynucleotides and associated DNA vaccines which
encode the various wild type and modified forms of the HIV Pol protein disclosed
herein. Partial or fully codon optimized DNA vaccine expression vector constructs
are preferred, but it is within the scope of the present invention to utilize "non-codon
optimized" versions of the constructs disclosed herein, especially modified versions of
HIV Pol which are shown to promote substantial cellular immune and humoral
immune responses subsequent to host administration.

The DNA backbone of the DNA vaccines of the present invention is
preferably a DNA plasmid expression vector. DNA plasmid expression vectors
utilized in the present invention include but are not limited to constructs which
comprise the cytomegalovirus promoter with the intron A sequence (CMV-intA) and
a bovine growth hormone transcription termination sequence. In addition, DNA
plasmid vectors of the present invention preferably comprise an antibiotic resistance
marker, including but not limited to an ampicillin resistance gene, a neomycin
resistance gene or any other pharmaceutically acceptable antibiotic resistance marker.
In addition, an appropriate polylinker cloning site and a prokaryotic origin of
replication sequence are also preferred. Specific DNA vectors exemplified herein
include V1, V1J (SEQ ID NO:13), V1Jneo (SEQ ID NO:14), V1Jns (Figure 1A, SEQ
ID NO:15), V1R (SEQ ID NO:26), and any of the aforementioned vectors wherein a
nucleotide sequence encoding a leader peptide, preferably the human tPA leader, is
fused directly downstream of the CMV-intA promoter, including but not limited to
V1Jns-tpa, as shown in Figure 1B and SEQ ID NO:28.

The present invention especially relates to a DNA vaccine and a
pharmaceutically active vaccine composition which contains this DNA vaccine, and
the use as a prophylactic and/or therapeutic vaccine for host immunization, preferably
human host immunization, against an HIV infection or to combat an existing HIV
condition. These DNA vaccines are represented by codon optimized DNA molecules
encoding codon optimized HIV-1 Pol (e.g., SEQ ID NO:2), codon optimized HIV-1
Pol fused to an amino terminal localized leader sequence (e.g., SEQ ID NO:6), and,
especially preferable and the essence of the present invention, biologically inactive
Pol proteins (IA Pol; e.g., SEQ ID NO:4) devoid of significant PR, RT, RNase or IN
activity associated with wild type Pol and a concomitant construct which contains a
leader peptide at the amino terminal region of the IA Pol protein. These constructs
are ligated within an appropriate DNA plasmid vector, with or without a nucleotide
sequence encoding a functional leader peptide. Preferred DNA vaccines of the
present invention comprise codon optimized DNA molecules encoding codon
optimized HIV-1 Pol and inactivated versions of Pol, ligated in DNA vectors disclosed
herein, or any of the aforementioned vectors wherein a nucleotide sequence encoding
a leader peptide, preferably the human tPA leader, is fused directly downstream of the
CMV-intA promoter, including but not limited to V1Jns-tpa, as shown in Figure 1B
and SEQ ID NO:28.

Therefore, the present invention relates to DNA vaccines which include, but
are in no way limited to, V1Jns-WTPol (comprising the DNA molecule encoding WT
Pol, as set forth in SEQ ID NO:2), V1Jns-tPA-WTPol (comprising the DNA
molecule encoding tPA Pol, as set forth in SEQ ID NO:6), V1Jns-IAPol (comprising
the DNA molecule encoding IA Pol, as set forth in SEQ ID NO:4), and V1Jns-tPA-IAPol
(comprising the DNA molecule encoding tPA-IA Pol, as set forth in SEQ ID
NO:8). Especially preferred are V1Jns-IAPol and V1Jns-tPA-IAPol, as exemplified
in Example Section 2.

The present invention also relates to HIV Pol polynucleotide
pharmaceutical products, as well as the production and use thereof, wherein the
DNA vaccines are formulated with an adjuvant or adjuvants which may increase
immunogenicity of the DNA polynucleotide vaccines of the present invention,
namely by promoting an enhanced cellular and/or humoral response subsequent to
inoculation. A preferred adjuvant is an aluminum phosphate-based adjuvant or a
calcium phosphate-based adjuvant, with an aluminum phosphate adjuvant being
especially preferred. Another preferred adjuvant is a non-ionic block copolymer,
preferably comprising the blocks of polyoxyethylene (POE) and
polyoxypropylene (POP) such as a POE-POP-POE block copolymer. These
adjuvanted forms comprising the DNA vaccines disclosed herein are useful in
increasing cellular responses to DNA vaccination.

As used herein, a DNA vaccine or DNA polynucleotide vaccine is a DNA
molecule (i.e., "nucleic acid", "polynucleotide") which contains essential regulatory
elements such that upon introduction into a living, vertebrate cell, it is able to direct
the cellular machinery to produce translation products encoded by the respective pol
genes of the present invention.

BRIEF DESCRIPTION OF THE FIGURES

Figure 1A-B shows a schematic representation of DNA vaccine expression
vectors V1Jns (A) and V1Jns-tPA (B) utilized for HIV-1 pol and HIV-1 modified pol
constructs.

Figure 2A-C shows the nucleotide (SEQ ID NO: 3) and amino acid sequence 
(SEQ ID NO:4) of IA-Pol. Underlined codons and amino acids denote mutations, as 
listed in Table 1. 

Figure 3 shows the codon optimized nucleotide and amino acid sequences
through the fusion junction of tPA-IA-Pol (contained within SEQ ID NOs: 7 and 8,
respectively). The underlined portion represents the NH2-terminal region of IA-Pol.

Figure 4 shows generation of a humoral response (measured as the geometric
means of anti-RT endpoint titers) from mice immunized with one or two doses of
codon optimized V1Jns-IApol and V1Jns-tpa-IApol. A portion of mice that received
30 µg of each plasmid was boosted at T=8 wks; sera from all mice were collected at 4
wk post dose 2.

Figure 5 shows the number of IFN-gamma secreting cells per 10^6 cells
following stimulation with pools of either CD4+ (aa641-660, aa731-750) or CD8+
(aa201-220, aa311-330, aa571-590, aa781-800) specific peptides of splenocytes (pool
of 5 spleens/cohort) from control mice and those vaccinated with increasing single
doses of codon optimized V1Jns-IApol or 30 µg of codon optimized V1Jns-tpa-IApol
(13 wks post dose 1). Mice (n=5) vaccinated with a second dose of 30 µg of either
plasmid were analyzed in an ELIspot assay at 6 wks post dose 2. Reported are the
sums of the number of spots stimulated by each individual CD8+ peptide because the
spots in the wells to which the pool was added are too dense to acquire accurate
counts. The CD4+ cell counts are taken from the responses to the peptide pool. Error
bars represent standard deviations for counts from triplicate wells per sample per
antigen.

Figure 6A-C shows ELIspot analysis of peripheral blood cells collected from
rhesus macaques immunized three times (T=0, 4, 8 wks) with 5 mg of codon
optimized HIV-1 Pol expressing plasmids. Antigen-specific IFN-gamma secretion
was stimulated by adding one of two pools consisting of 20-mer peptides derived
from vaccine sequence (mpol-1, aa1-420; mpol-2, aa411-850). (A) Frequencies of
spot-forming cells (SFC) as a function of time for 3 monkeys (Tag No. 94R008,
94R013, 94R033) vaccinated with V1Jns-IApol. The reported values are corrected
for background responses without peptide restimulation. (B) Frequencies of
spot-forming cells (SFC) as a function of time for 3 monkeys (Tag No. 920078, 920073,
94R028) vaccinated with 5 mg of V1Jns-tpa-IApol. (C) ELIspot responses were also
measured from a monkey (920072) that did not receive any immunization.

Figure 7A-B shows bulk CTL killing from rhesus macaques immunized with
codon optimized V1Jns-IApol (A) or codon optimized V1Jns-tpa-IApol (B) at 8 weeks
following the third vaccination. Restimulation was performed using recombinant
vaccinia virus expressing pol and target cells were prepared by pulsing with the
peptide pools, mpol-1 and mpol-2.

Figure 8 shows detection of in vitro pol expression from cell lysates of 293
cells transfected with 10 µg of various pol constructs. Bands were detected using
antiserum from an HIV-1 seropositive human subject. Equal amounts of total protein
were loaded for each lane. The lanes contain the lysates from cells transfected with
the following: 1: mock; 2: V1Jns-wt-pol; 3: V1Jns-IApol (codon optimized);
4: V1Jns-tpa-IApol (codon optimized); 5: V1Jns-tpa-pol (codon optimized);
6: V1R-wt-pol (codon optimized); 7: blank; and 8: 80 ng RT.

Figure 9 shows the geometric mean anti-RT titers (GMT) plus the standard
errors of the geometric means for cohorts of 5 mice that received one (open circles) or
two doses (solid circles) of 1, 10, 100 µg of V1R-wt-pol (codon optimized) or
V1Jns-wt-pol. Sera from all animals were collected at 2 weeks post dose 2 (or 7 wks post
dose 1) and assayed simultaneously. Statistical analyses were performed to compare
cohorts that received the same amount and number of immunizations of either
plasmid; p values (two-tail) less than 5% are shown above the bars that connect the
correlated cohorts to reflect statistically significant differences.

Figure 10 shows cellular immune responses in BALB/c mice vaccinated i.m.
with 1 (pd1) or 2 (pd2) doses of varying amounts of either wt-pol (virus derived) or
wt-pol (codon optimized) plasmids. At 3 wks post dose 2, frequencies of
IFN-gamma-secreting splenocytes are determined from pools of 5 spleens per cohort against
mixtures of either CD4+ peptides (aa21-40, aa411-430, aa531-550, aa641-660,
aa731-750, aa771-790) or CD8+ peptides (aa201-220, aa311-330) at 4 µg/mL final
concentration per peptide.
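
The figure legends above rely on two summary statistics: geometric mean titers (Figures 4 and 9) and background-corrected spot-forming cell (SFC) frequencies per 10^6 cells (Figures 5 and 6). A minimal sketch of both calculations is given here. The function names are hypothetical, the number of cells plated per well is an assumed input that the legends do not state, and computing the standard error on the log scale is an assumption about how "standard errors of the geometric means" might be obtained.

```python
import math
from statistics import mean, stdev


def geometric_mean_titer(titers: list[float]) -> float:
    """Geometric mean of endpoint titers (all titers must be positive)."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty list of positive values")
    logs = [math.log(t) for t in titers]
    return math.exp(mean(logs))


def gmt_standard_error_factor(titers: list[float]) -> float:
    """Multiplicative standard error of the geometric mean.

    Assumption: the standard error is computed on log-transformed titers
    and exponentiated, so GMT * factor and GMT / factor bound one SE.
    """
    if len(titers) < 2 or any(t <= 0 for t in titers):
        raise ValueError("need at least two positive titers")
    logs = [math.log(t) for t in titers]
    return math.exp(stdev(logs) / math.sqrt(len(logs)))


def background_corrected_sfc(
    spots_with_peptide: list[int],
    spots_without_peptide: list[int],
    cells_per_well: int,
) -> float:
    """Mean spots per 10^6 cells after subtracting the no-peptide background.

    Replicate wells are averaged first; negative results are floored at zero.
    """
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    net_spots = mean(spots_with_peptide) - mean(spots_without_peptide)
    return max(0.0, net_spots * 1_000_000 / cells_per_well)
```

As a usage example with made-up numbers, `geometric_mean_titer([100, 10000])` is approximately 1000, and triplicate wells of 30, 34 and 32 spots against a background of 2 spots per well at an assumed 200,000 cells per well correspond to 150 SFC per 10^6 cells.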
DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to synthetic DNA molecules and associated
DNA vaccines which elicit CTL and Th cellular immune responses upon
administration to the host, including primates and especially humans. An effect of the
cellular immune-directed vaccines of the present invention should be a lower
transmission rate to previously uninfected individuals and/or reduction in the levels of
the viral loads within an infected individual, so as to prolong the asymptomatic phase
of HIV-1 infection. In particular, the present invention relates to DNA vaccines
which encode various forms of HIV-1 Pol, wherein administration, intracellular
delivery and expression of the HIV-1 Pol gene of interest elicits a host CTL and Th
response. The preferred synthetic DNA molecules of the present invention encode
codon optimized wild type Pol (without Pro activity) and various codon optimized
inactivated HIV-1 Pol proteins. The HIV-1 pol constructs disclosed herein are
especially preferred for pharmaceutical uses, especially for human administration as a
DNA vaccine. The HIV-1 genome employs predominantly uncommon codons
compared to highly expressed human genes. Therefore, the pol open reading frame
has been synthetically manipulated using optimal codons for human expression. As
noted above, a preferred embodiment of the present invention relates to DNA
molecules which comprise an HIV-1 pol open reading frame, whether encoding full
length pol or a modification or fusion as described herein, wherein the codon usage
has been optimized for expression in a mammal, especially a human.

The synthetic pol gene disclosed herein comprises the coding sequences for
the reverse transcriptase (or RT, which consists of a polymerase and RNase H activity)
and integrase (IN). The protein sequence is based on that of Hxb2r, a clonal isolate of
IIIB; this sequence has been shown to be closest to the consensus clade B sequence
with only 16 nonidentical residues out of 848 (Korber, et al., 1998, Human
retroviruses and AIDS, Los Alamos National Laboratory, Los Alamos, New Mexico).
The skilled artisan will understand after review of this specification that any available
HIV-1 or HIV-2 strain provides a potential template for the generation of HIV pol
DNA vaccine constructs disclosed herein. It is further noted that the protease gene is
excluded from the DNA vaccine constructs of the present invention to ensure safety
from any residual protease activity in spite of mutational inactivation. The design of
the gene sequences for both wild-type (wt-pol) and inactivated pol (IA-pol)
incorporates the use of human preferred ("humanized") codons for each amino acid
residue in the sequence in order to maximize in vivo mammalian expression (Lathe,
1985, J. Mol. Biol. 183:1-12). As can be discerned by inspecting the codon usage in
SEQ ID NOs: 1, 3, 5 and 7, the following codon usage for mammalian optimization is
preferred: Met (ATG), Gly (GGC), Lys (AAG), Trp (TGG), Ser (TCC), Arg (AGG),
Val (GTG), Pro (CCC), Thr (ACC), Glu (GAG), Leu (CTG), His (CAC), Ile (ATC),
Asn (AAC), Cys (TGC), Ala (GCC), Gln (CAG), Phe (TTC) and Tyr (TAC). For an
additional discussion relating to mammalian (human) codon optimization, see
WO 97/31115 (PCT/US97/02294), which is hereby incorporated by reference. It is
intended that the skilled artisan may use alternative versions of codon optimization or
may omit this step when generating HIV pol vaccine constructs within the scope of
the present invention. Therefore, the present invention also relates to non-codon
optimized versions of DNA molecules and associated DNA vaccines which encode
the various wild type and modified forms of the HIV Pol protein disclosed herein.
However, codon optimization of these constructs is a preferred embodiment of this
invention.
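
By way of illustration only, the preferred codon usage listed above may be sketched as a simple back-translation table. The function name back_translate and the dictionary name PREFERRED_CODON are hypothetical and do not appear in this specification. Pro is taken here as CCC, the codon that encodes Pro at the start of the open reading frame of SEQ ID NO:1, and Asp is omitted because the list above does not give a codon for it.

```python
# Preferred codons from the list above (three-letter residue -> codon).
# Assumption: Pro is taken as CCC, the codon used for Pro in SEQ ID NO:1.
# Asp is not in the list above, so it is deliberately left out.
PREFERRED_CODON = {
    "Met": "ATG", "Gly": "GGC", "Lys": "AAG", "Trp": "TGG", "Ser": "TCC",
    "Arg": "AGG", "Val": "GTG", "Pro": "CCC", "Thr": "ACC", "Glu": "GAG",
    "Leu": "CTG", "His": "CAC", "Ile": "ATC", "Asn": "AAC", "Cys": "TGC",
    "Ala": "GCC", "Gln": "CAG", "Phe": "TTC", "Tyr": "TAC",
}


def back_translate(residues):
    """Return a DNA string that uses one preferred codon per residue."""
    missing = sorted({r for r in residues if r not in PREFERRED_CODON})
    if missing:
        raise KeyError(f"no preferred codon listed for: {missing}")
    return "".join(PREFERRED_CODON[r] for r in residues)


# First six residues of the wt-pol open reading frame.
print(back_translate(["Met", "Ala", "Pro", "Ile", "Ser", "Pro"]))
```

The disclosed sequences do not apply a single codon per residue throughout (for example, the seventh codon of the SEQ ID NO:1 open reading frame is ATT rather than ATC for Ile), so a table of this kind is only a starting point for a design and not a reproduction of the disclosed sequences.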

A particular embodiment of the present invention relates to codon optimized
wt-pol DNA constructs (herein, "wt-pol" or "wt-pol (codon optimized)") wherein
DNA sequences encoding the protease (PR) activity are deleted, leaving codon
optimized "wild type" sequences which encode RT (reverse transcriptase and RNase
H activity) and IN (integrase) activity. A DNA molecule which encodes this protein is
disclosed herein as SEQ ID NO:1, the open reading frame being contained from an
initiating Met residue at nucleotides 10-12 to a termination codon from nucleotides
2560-2562. SEQ ID NO:1 is as follows:

AGATCTACCA TGGCCCCCAT CTCCCCCATT GAGACTGTGC CTGTGAAGCT GAAGCCTGGC
ATGGATGGCC CCAAGGTGAA GCAGTGGCCC CTGACTGAGG AGAAGATCAA GGCCCTGGTG
GAAATCTGCA CTGAGATGGA GAAGGAGGGC AAAATCTCCA AGATTGGCCC CGAGAACCCC
TACAACACCC CTGTGTTTGC CATCAAGAAG AAGGACTCCA CCAAGTGGAG GAAGCTGGTG
GACTTCAGGG AGCTGAACAA GAGGACCCAG GACTTCTGGG AGGTGCAGCT GGGCATCCCC
CACCCCGCTG GCCTGAAGAA GAAGAAGTCT GTGACTGTGC TGGATGTGGG GGATGCCTAC
TTCTCTGTGC CCCTGGATGA GGACTTCAGG AAGTACACTG CCTTCACCAT CCCCTCCATC
AACAATGAGA CCCCTGGCAT CAGGTACCAG TACAATGTGC TGCCCCAGGG CTGGAAGGGC
TCCCCTGCCA TCTTCCAGTC CTCCATGACC AAGATCCTGG AGCCCTTCAG GAAGCAGAAC
CCTGACATTG TGATCTACCA GTACATGGAT GACCTGTATG TGGGCTCTGA CCTGGAGATT
GGGCAGCACA GGACCAAGAT TGAGGAGCTG AGGCAGCACC TGCTGAGGTG GGGCCTGACC
ACCCCTGACA AGAAGCACCA GAAGGAGCCC CCCTTCCTGT GGATGGGCTA TGAGCTGCAC
CCCGACAAGT GGACTGTGCA GCCCATTGTG CTGCCTGAGA AGGACTCCTG GACTGTGAAT
GACATCCAGA AGCTGGTGGG CAAGCTGAAC TGGGCCTCCC AAATCTACCC TGGCATCAAG
GTGAGGCAGC TGTGCAAGCT GCTGAGGGGC ACCAAGGCCC TGACTGAGGT GATCCCCCTG
ACTGAGGAGG CTGAGCTGGA GCTGGCTGAG AACAGGGAGA TCCTGAAGGA GCCTGTGCAT
GGGGTGTACT ATGACCCCTC CAAGGACCTG ATTGCTGAGA TCCAGAAGCA GGGCCAGGGC
CAGTGGACCT ACCAAATCTA CCAGGAGCCC TTCAAGAACC TGAAGACTGG CAAGTATGCC
AGGATGAGGG GGGCCCACAC CAATGATGTG AAGCAGCTGA CTGAGGCTGT GCAGAAGATC
ACCACTGAGT CCATTGTGAT CTGGGGCAAG ACCCCCAAGT TCAAGCTGCC CATCCAGAAG
GAGACCTGGG AGACCTGGTG GACTGAGTAC TGGCAGGCCA CCTGGATCCC TGAGTGGGAG
TTTGTGAACA CCCCCCCCCT GGTGAAGCTG TGGTACCAGC TGGAGAAGGA GCCCATTGTG
GGGGCTGAGA CCTTCTATGT GGATGGGGCT GCCAACAGGG AGACCAAGCT GGGCAAGGCT
GGCTATGTGA CCAACAGGGG CAGGCAGAAG GTGGTGACCC TGACTGACAC CACCAACCAG
AAGACTGAGC TCCAGGCCAT CTACCTGGCC CTCCAGGACT CTGGCCTGGA GGTGAACATT
GTGACTGACT CCCAGTATGC CCTGGGCATC ATCCAGGCCC AGCCTGATCA GTCTGAGTCT
GAGCTGGTGA ACCAGATCAT TGAGCAGCTG ATCAAGAAGG AGAAGGTGTA CCTGGCCTGG
GTGCCTGCCC ACAAGGGCAT TGGGGGCAAT GAGCAGGTGG ACAAGCTGGT GTCTGCTGGC
ATCAGGAAGG TGCTGTTCCT GGATGGCATT GACAAGGCCC AGGATGAGCA TGAGAAGTAC
CACTCCAACT GGAGGGCTAT GGCCTCTGAC TTCAACCTGC CCCCTGTGGT GGCTAAGGAG
ATTGTGGCCT CCTGTGACAA GTGCCAGCTG AAGGGGGAGG CCATGCATGG GCAGGTGGAC
TGCTCCCCTG GCATCTGGCA GCTGGACTGC ACCCACCTGG AGGGCAAGGT GATCCTGGTG
GCTGTGCATG TGGCCTCCGG CTACATTGAG GCTGAGGTGA TCCCTGCTGA GACAGGCCAG
GAGACTGCCT ACTTCCTGCT GAAGCTGGCT GGCAGGTGGC CTGTGAAGAC CATCCACACT
GACAATGGCT CCAACTTCAC TGGGGCCACA GTGAGGGCTG CCTGCTGGTG GGCTGGCATC
AAGCAGGAGT TTGGCATCCC CTACAACCCC CAGTCCCAGG GGGTGGTGGA GTCCATGAAC
AAGGAGCTGA AGAAGATCAT TGGGCAGGTG AGGGACCAGG CTGAGCACCT GAAGACAGCT
GTGCAGATGG CTGTGTTCAT CCACAACTTC AAGAGGAAGG GGGGCATCGG GGGCTACTCC
GCTGGGGAGA GGATTGTGGA CATCATTGCC ACAGACATCC AGACCAAGGA GCTCCAGAAG
CAGATCACCA AGATCCAGAA CTTCAGGGTG TACTACAGGG ACTCCAGGAA CCCCCTGTGG
AAGGGCCCTG CCAAGCTGCT GTGGAAGGGG GAGGGGGCTG TGGTGATCCA GGACAACTCT
GACATCAAGG TGGTGCCCAG GAGGAAGGCC AAGATCATCA GGGACTATGG CAAGCAGATG
GCTGGGGATG ACTGTGTGGC CTCCAGGCAG GATGAGGACT AAAGCCCGGG CAGATCT (SEQ ID NO:1).

The open reading frame of the wild type pol construct disclosed as SEQ ID
NO:1 contains 850 amino acids, disclosed herein as SEQ ID NO:2, as follows:

Met Ala Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro
Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys
Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys
Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala
Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg
Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile
Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu Asp
Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg Lys
Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile
Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala
Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln
Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val Gly
Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg
Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His Gln
Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys
Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val
Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile
Tyr Pro Gly Ile Lys Val Arg Gln Leu Cys Lys Leu Leu Arg Gly Thr
Lys Ala Leu Thr Glu Val Ile Pro Leu Thr Glu Glu Ala Glu Leu Glu
Leu Ala Glu Asn Arg Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr
Tyr Asp Pro Ser Lys Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Gln
Gly Gln Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys
Thr Gly Lys Tyr Ala Arg Met Arg Gly Ala His Thr Asn Asp Val Lys
Gln Leu Thr Glu Ala Val Gln Lys Ile Thr Thr Glu Ser Ile Val Ile
Trp Gly Lys Thr Pro Lys Phe Lys Leu Pro Ile Gln Lys Glu Thr Trp
Glu Thr Trp Trp Thr Glu Tyr Trp Gln Ala Thr Trp Ile Pro Glu Trp
Glu Phe Val Asn Thr Pro Pro Leu Val Lys Leu Trp Tyr Gln Leu Glu
Lys Glu Pro Ile Val Gly Ala Glu Thr Phe Tyr Val Asp Gly Ala Ala
Asn Arg Glu Thr Lys Leu Gly Lys Ala Gly Tyr Val Thr Asn Arg Gly
Arg Gln Lys Val Val Thr Leu Thr Asp Thr Thr Asn Gln Lys Thr Glu
Leu Gln Ala Ile Tyr Leu Ala Leu Gln Asp Ser Gly Leu Glu Val Asn
Ile Val Thr Asp Ser Gln Tyr Ala Leu Gly Ile Ile Gln Ala Gln Pro
Asp Gln Ser Glu Ser Glu Leu Val Asn Gln Ile Ile Glu Gln Leu Ile
Lys Lys Glu Lys Val Tyr Leu Ala Trp Val Pro Ala His Lys Gly Ile
Gly Gly Asn Glu Gln Val Asp Lys Leu Val Ser Ala Gly Ile Arg Lys
Val Leu Phe Leu Asp Gly Ile Asp Lys Ala Gln Asp Glu His Glu Lys
Tyr His Ser Asn Trp Arg Ala Met Ala Ser Asp Phe Asn Leu Pro Pro
Val Val Ala Lys Glu Ile Val Ala Ser Cys Asp Lys Cys Gln Leu Lys
Gly Glu Ala Met His Gly Gln Val Asp Cys Ser Pro Gly Ile Trp Gln
Leu Asp Cys Thr His Leu Glu Gly Lys Val Ile Leu Val Ala Val His
Val Ala Ser Gly Tyr Ile Glu Ala Glu Val Ile Pro Ala Glu Thr Gly
Gln Glu Thr Ala Tyr Phe Leu Leu Lys Leu Ala Gly Arg Trp Pro Val
Lys Thr Ile His Thr Asp Asn Gly Ser Asn Phe Thr Gly Ala Thr Val
Arg Ala Ala Cys Trp Trp Ala Gly Ile Lys Gln Glu Phe Gly Ile Pro
Tyr Asn Pro Gln Ser Gln Gly Val Val Glu Ser Met Asn Lys Glu Leu
Lys Lys Ile Ile Gly Gln Val Arg Asp Gln Ala Glu His Leu Lys Thr
Ala Val Gln Met Ala Val Phe Ile His Asn Phe Lys Arg Lys Gly Gly
Ile Gly Gly Tyr Ser Ala Gly Glu Arg Ile Val Asp Ile Ile Ala Thr
Asp Ile Gln Thr Lys Glu Leu Gln Lys Gln Ile Thr Lys Ile Gln Asn
Phe Arg Val Tyr Tyr Arg Asp Ser Arg Asn Pro Leu Trp Lys Gly Pro
Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val Ile Gln Asp Asn
Ser Asp Ile Lys Val Val Pro Arg Arg Lys Ala Lys Ile Ile Arg Asp
Tyr Gly Lys Gln Met Ala Gly Asp Asp Cys Val Ala Ser Arg Gln Asp
Glu Asp (SEQ ID NO:2).
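
As a consistency check, the open reading frame coordinates stated for SEQ ID NO:1 (initiating Met at nucleotides 10-12, termination codon at nucleotides 2560-2562) can be compared with the stated length of 850 amino acids. The following is a minimal sketch; the helper name orf_codon_count is hypothetical.

```python
def orf_codon_count(start_first_nt, stop_first_nt):
    """Count the sense codons between an initiating ATG and a stop codon.

    Both arguments are 1-based positions of the first nucleotide of a codon.
    """
    span = stop_first_nt - start_first_nt
    if span <= 0 or span % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return span // 3


# SEQ ID NO:1: Met at nucleotides 10-12, termination codon at 2560-2562.
print(orf_codon_count(10, 2560))  # 850
```

The result agrees with the 850 residues of SEQ ID NO:2.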

The present invention especially relates to a codon optimized HIV-1 DNA pol
construct wherein, in addition to deletion of the portion of the wild type sequence
encoding the protease activity, a combination of active site residue mutations are
introduced which are deleterious to HIV-1 pol (RT-RH-IN) activity of the expressed
protein. Therefore, the present invention preferably relates to a HIV-1 DNA pol
construct which is devoid of DNA sequences encoding any PR activity, as well as
containing a mutation(s) which at least partially, and preferably substantially,
abolishes RT, RNase H and/or IN activity. One type of HIV-1 pol mutant may include
but is not limited to a mutated DNA molecule comprising at least one nucleotide
substitution which results in a point mutation which effectively alters an active site
within the RT, RNase H and/or IN regions of the expressed protein, resulting in at least
substantially decreased enzymatic activity for the RT, RNase H and/or IN functions of
HIV-1 Pol. In a preferred embodiment of this portion of the invention, a HIV-1 DNA
pol construct contains a mutation or mutations within the Pol coding region which
effectively abolishes RT, RNase H and IN activity. An especially preferable HIV-1
DNA pol construct is a DNA molecule which contains at least one point mutation
which alters the active site of the RT, RNase H and IN domains of Pol, such that each
activity is at least substantially abolished. Such a HIV-1 Pol mutant will most likely
comprise at least one point mutation in or around each catalytic domain responsible
for RT, RNase H and IN activity, respectively. To this end, an especially preferred
HIV-1 DNA pol construct is exemplified herein and contains nine codon substitution
mutations which result in an inactivated Pol protein (IA Pol: SEQ ID NO:4, Figure
2A-C) which has no PR, RT, RNase H or IN activity, wherein three such point
mutations reside within each of the RT, RNase H and IN catalytic domains. Therefore,
an especially preferred exemplification is a DNA molecule which encodes IA-pol,
which contains all nine mutations as shown below in Table 1. An additional preferred
amino acid residue for substitution is Asp551, localized within the RNase H domain of
Pol. Any combination of the mutations disclosed herein may be suitable and therefore
may be utilized as an IA-Pol-based vaccine of the present invention. While addition
and deletion mutations are contemplated and within the scope of the invention, the
preferred mutation is a point mutation resulting in a substitution of the wild type
amino acid with an alternative amino acid residue.



Table 1

wt aa    aa residue    mutant aa    enzyme function
Asp      112           Ala          RT
Asp      187           Ala          RT
Asp      188           Ala          RT
Asp      445           Ala          RNase H
Glu      480           Ala          RNase H
Asp      500           Ala          RNase H
Asp      626           Ala          IN
Asp      678           Ala          IN
Glu      714           Ala          IN

It is preferred that point mutations be incorporated into the IApol mutant vaccines of
the present invention so as to lessen the possibility of altering epitopes in and around
the active site(s) of HIV-1 Pol.

To this end, SEQ ID NO:3 discloses the nucleotide sequence which codes for
a codon optimized pol in addition to the nine mutations shown in Table 1, disclosed as
follows, and referred to herein as "IApol":

AGATCTACCA TGGCCCCCAT CTCCCCCATT GAGACTGTGC CTGTGAAGCT GAAGCCTGGC
ATGGATGGCC CCAAGGTGAA GCAGTGGCCC CTGACTGAGG AGAAGATCAA GGCCCTGGTG
GAAATCTGCA CTGAGATGGA GAAGGAGGGC AAAATCTCCA AGATTGGCCC CGAGAACCCC
TACAACACCC CTGTGTTTGC CATCAAGAAG AAGGACTCCA CCAAGTGGAG GAAGCTGGTG
GACTTCAGGG AGCTGAACAA GAGGACCCAG GACTTCTGGG AGGTGCAGCT GGGCATCCCC
CACCCCGCTG GCCTGAAGAA GAAGAAGTCT GTGACTGTGC TGGCTGTGGG GGATGCCTAC
TTCTCTGTGC CCCTGGATGA GGACTTCAGG AAGTACACTG CCTTCACCAT CCCCTCCATC
AACAATGAGA CCCCTGGCAT CAGGTACCAG TACAATGTGC TGCCCCAGGG CTGGAAGGGC
TCCCCTGCCA TCTTCCAGTC CTCCATGACC AAGATCCTGG AGCCCTTCAG GAAGCAGAAC
CCTGACATTG TGATCTACCA GTACATGGCT GCCCTGTATG TGGGCTCTGA CCTGGAGATT
GGGCAGCACA GGACCAAGAT TGAGGAGCTG AGGCAGCACC TGCTGAGGTG GGGCCTGACC
ACCCCTGACA AGAAGCACCA GAAGGAGCCC CCCTTCCTGT GGATGGGCTA TGAGCTGCAC
CCCGACAAGT GGACTGTGCA GCCCATTGTG CTGCCTGAGA AGGACTCCTG GACTGTGAAT
GACATCCAGA AGCTGGTGGG CAAGCTGAAC TGGGCCTCCC AAATCTACCC TGGCATCAAG
GTGAGGCAGC TGTGCAAGCT GCTGAGGGGC ACCAAGGCCC TGACTGAGGT GATCCCCCTG
ACTGAGGAGG CTGAGCTGGA GCTGGCTGAG AACAGGGAGA TCCTGAAGGA GCCTGTGCAT
GGGGTGTACT ATGACCCCTC CAAGGACCTG ATTGCTGAGA TCCAGAAGCA GGGCCAGGGC
CAGTGGACCT ACCAAATCTA CCAGGAGCCC TTCAAGAACC TGAAGACTGG CAAGTATGCC
AGGATGAGGG GGGCCCACAC CAATGATGTG AAGCAGCTGA CTGAGGCTGT GCAGAAGATC
ACCACTGAGT CCATTGTGAT CTGGGGCAAG ACCCCCAAGT TCAAGCTGCC CATCCAGAAG
GAGACCTGGG AGACCTGGTG GACTGAGTAC TGGCAGGCCA CCTGGATCCC TGAGTGGGAG
TTTGTGAACA CCCCCCCCCT GGTGAAGCTG TGGTACCAGC TGGAGAAGGA GCCCATTGTG
GGGGCTGAGA CCTTCTATGT GGCTGGGGCT GCCAACAGGG AGACCAAGCT GGGCAAGGCT
GGCTATGTGA CCAACAGGGG CAGGCAGAAG GTGGTGACCC TGACTGACAC CACCAACCAG
AAGACTGCCC TCCAGGCCAT CTACCTGGCC CTCCAGGACT CTGGCCTGGA GGTGAACATT
GTGACTGCCT CCCAGTATGC CCTGGGCATC ATCCAGGCCC AGCCTGATCA GTCTGAGTCT
GAGCTGGTGA ACCAGATCAT TGAGCAGCTG ATCAAGAAGG AGAAGGTGTA CCTGGCCTGG
GTGCCTGCCC ACAAGGGCAT TGGGGGCAAT GAGCAGGTGG ACAAGCTGGT GTCTGCTGGC
ATCAGGAAGG TGCTGTTCCT GGATGGCATT GACAAGGCCC AGGATGAGCA TGAGAAGTAC
CACTCCAACT GGAGGGCTAT GGCCTCTGAC TTCAACCTGC CCCCTGTGGT GGCTAAGGAG
ATTGTGGCCT CCTGTGACAA GTGCCAGCTG AAGGGGGAGG CCATGCATGG GCAGGTGGAC
TGCTCCCCTG GCATCTGGCA GCTGGCCTGC ACCCACCTGG AGGGCAAGGT GATCCTGGTG
GCTGTGCATG TGGCCTCCGG CTACATTGAG GCTGAGGTGA TCCCTGCTGA GACAGGCCAG
GAGACTGCCT ACTTCCTGCT GAAGCTGGCT GGCAGGTGGC CTGTGAAGAC CATCCACACT
GCCAATGGCT CCAACTTCAC TGGGGCCACA GTGAGGGCTG CCTGCTGGTG GGCTGGCATC
AAGCAGGAGT TTGGCATCCC CTACAACCCC CAGTCCCAGG GGGTGGTGGC CTCCATGAAC
AAGGAGCTGA AGAAGATCAT TGGGCAGGTG AGGGACCAGG CTGAGCACCT GAAGACAGCT
GTGCAGATGG CTGTGTTCAT CCACAACTTC AAGAGGAAGG GGGGCATCGG GGGCTACTCC
GCTGGGGAGA GGATTGTGGA CATCATTGCC ACAGACATCC AGACCAAGGA GCTCCAGAAG
CAGATCACCA AGATCCAGAA CTTCAGGGTG TACTACAGGG ACTCCAGGAA CCCCCTGTGG
AAGGGCCCTG CCAAGCTGCT GTGGAAGGGG GAGGGGGCTG TGGTGATCCA GGACAACTCT
GACATCAAGG TGGTGCCCAG GAGGAAGGCC AAGATCATCA GGGACTATGG CAAGCAGATG
GCTGGGGATG ACTGTGTGGC CTCCAGGCAG GATGAGGACT AAAGCCCGGG CAGATCT (SEQ ID NO:3).

In order to produce the IA-pol DNA vaccine construction, inactivation of the
enzymatic functions was achieved by replacing a total of nine active-site residues
from the enzyme subunits with alanine side-chains. As shown in Table 1, all residues
that comprise the catalytic triad of the polymerase, namely Asp112, Asp187, and
Asp188, were substituted with alanine (Ala) residues (Larder, et al., 1987, Nature
327: 716-717; Larder, et al., 1989, Proc. Natl. Acad. Sci. 86: 4803-4807).
Three additional mutations were introduced at Asp445, Glu480 and Asp500 to abolish
RNase H activity (Asp551 was left unchanged in this IA Pol construct), with each
residue being substituted with an Ala residue (Davies, et al., 1991,
Science 252: 88-95; Schatz, et al., 1989, FEBS Lett. 257: 311-314; Mizrahi, et al.,
1990, Nucl. Acids Res. 18: 5359-5363). HIV pol integrase function was
abolished through three mutations at Asp626, Asp678 and Glu714. Again, each of
these residues has been substituted with an Ala residue (Wiskerchen, et al., 1995, J.
Virol. 69: 376-386; Leavitt, et al., 1993, J. Biol. Chem. 268: 2113-2119). Amino
acid residue Pro3 of SEQ ID NO:4 marks the start of the RT gene. The complete
amino acid sequence of IA-Pol is disclosed herein as SEQ ID NO:4, as follows:

Met Ala Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro
Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys
Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys
Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala
Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg
Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile
Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu Ala
Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg Lys
Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile
Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala
Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln
Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Ala Ala Leu Tyr Val Gly
Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg
Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His Gln
Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys
Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val
Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile
Tyr Pro Gly Ile Lys Val Arg Gln Leu Cys Lys Leu Leu Arg Gly Thr
Lys Ala Leu Thr Glu Val Ile Pro Leu Thr Glu Glu Ala Glu Leu Glu
Leu Ala Glu Asn Arg Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr
Tyr Asp Pro Ser Lys Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Gln
Gly Gln Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys
Thr Gly Lys Tyr Ala Arg Met Arg Gly Ala His Thr Asn Asp Val Lys
Gln Leu Thr Glu Ala Val Gln Lys Ile Thr Thr Glu Ser Ile Val Ile
Trp Gly Lys Thr Pro Lys Phe Lys Leu Pro Ile Gln Lys Glu Thr Trp
Glu Thr Trp Trp Thr Glu Tyr Trp Gln Ala Thr Trp Ile Pro Glu Trp
Glu Phe Val Asn Thr Pro Pro Leu Val Lys Leu Trp Tyr Gln Leu Glu
Lys Glu Pro Ile Val Gly Ala Glu Thr Phe Tyr Val Ala Gly Ala Ala
Asn Arg Glu Thr Lys Leu Gly Lys Ala Gly Tyr Val Thr Asn Arg Gly
Arg Gln Lys Val Val Thr Leu Thr Asp Thr Thr Asn Gln Lys Thr Ala
Leu Gln Ala Ile Tyr Leu Ala Leu Gln Asp Ser Gly Leu Glu Val Asn
Ile Val Thr Ala Ser Gln Tyr Ala Leu Gly Ile Ile Gln Ala Gln Pro
Asp Gln Ser Glu Ser Glu Leu Val Asn Gln Ile Ile Glu Gln Leu Ile
Lys Lys Glu Lys Val Tyr Leu Ala Trp Val Pro Ala His Lys Gly Ile
Gly Gly Asn Glu Gln Val Asp Lys Leu Val Ser Ala Gly Ile Arg Lys
Val Leu Phe Leu Asp Gly Ile Asp Lys Ala Gln Asp Glu His Glu Lys
Tyr His Ser Asn Trp Arg Ala Met Ala Ser Asp Phe Asn Leu Pro Pro
Val Val Ala Lys Glu Ile Val Ala Ser Cys Asp Lys Cys Gln Leu Lys
Gly Glu Ala Met His Gly Gln Val Asp Cys Ser Pro Gly Ile Trp Gln
Leu Ala Cys Thr His Leu Glu Gly Lys Val Ile Leu Val Ala Val His
Val Ala Ser Gly Tyr Ile Glu Ala Glu Val Ile Pro Ala Glu Thr Gly
Gln Glu Thr Ala Tyr Phe Leu Leu Lys Leu Ala Gly Arg Trp Pro Val
Lys Thr Ile His Thr Ala Asn Gly Ser Asn Phe Thr Gly Ala Thr Val
Arg Ala Ala Cys Trp Trp Ala Gly Ile Lys Gln Glu Phe Gly Ile Pro
Tyr Asn Pro Gln Ser Gln Gly Val Val Ala Ser Met Asn Lys Glu Leu
Lys Lys Ile Ile Gly Gln Val Arg Asp Gln Ala Glu His Leu Lys Thr
Ala Val Gln Met Ala Val Phe Ile His Asn Phe Lys Arg Lys Gly Gly
Ile Gly Gly Tyr Ser Ala Gly Glu Arg Ile Val Asp Ile Ile Ala Thr
Asp Ile Gln Thr Lys Glu Leu Gln Lys Gln Ile Thr Lys Ile Gln Asn
Phe Arg Val Tyr Tyr Arg Asp Ser Arg Asn Pro Leu Trp Lys Gly Pro
Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val Ile Gln Asp Asn
Ser Asp Ile Lys Val Val Pro Arg Arg Lys Ala Lys Ile Ile Arg Asp
Tyr Gly Lys Gln Met Ala Gly Asp Asp Cys Val Ala Ser Arg Gln Asp
Glu Asp (SEQ ID NO:4).
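
The alanine substitutions described above can be illustrated with a short sketch that replaces listed residues with Ala after confirming the expected wild type residue at each position. The function name substitute_with_alanine is hypothetical. For brevity the example operates on residues 177-192 of SEQ ID NO:2 in one-letter code, which contain Asp187 and Asp188; the same approach would apply to the full-length sequence and all nine positions.

```python
def substitute_with_alanine(protein, sites):
    """Replace each listed residue with Ala ("A").

    protein: amino acid sequence in one-letter code.
    sites:   dict mapping a 1-based position to the expected wild type residue.
    """
    residues = list(protein)
    for position, wild_type in sites.items():
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(
                f"position {position}: expected {wild_type}, found {found}"
            )
        residues[position - 1] = "A"
    return "".join(residues)


# Residues 177-192 of SEQ ID NO:2:
# Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val Gly
window = "NPDIVIYQYMDDLYVG"
offset = 176  # residue 177 is position 1 of the window

# Asp187 and Asp188, renumbered relative to the window.
local_sites = {187 - offset: "D", 188 - offset: "D"}
print(substitute_with_alanine(window, local_sites))  # NPDIVIYQYMAALYVG
```

Checking the wild type residue before substitution guards against numbering mismatches, for instance when a leader peptide has been added upstream and the positions have shifted.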

As noted above, it will be understood that any combination of the mutations 
disclosed above may be suitable and therefore be utilized as an IA-pol-based vaccine 
of the present invention. For example, it may be possible to mutate only 2 of the 3 
residues within the respective reverse transcriptase, RNase H, and integrase coding 
regions while still abolishing these enzymatic activities. However, the IA-pol 
construct described above and disclosed as SEQ ID NO:3, as well as the expressed 
protein (SEQ ID NO:4) is preferred. It is also preferred that at least one mutation be 
present in each of the three catalytic domains. 

Another aspect of the present invention is to generate codon optimized HIV-1
Pol-based vaccine constructions which comprise a eukaryotic trafficking signal
peptide such as from tPA (tissue-type plasminogen activator) or a leader peptide
such as is found in highly expressed mammalian proteins such as immunoglobulin
leader peptides. Any functional leader peptide may be tested for efficacy. However,
a preferred embodiment of the present invention is to provide for HIV-1 Pol mutant
vaccine constructions as disclosed herein which also comprise a leader peptide,
preferably a leader peptide from human tPA. In other words, a codon optimized
HIV-1 Pol mutant such as IA-Pol (SEQ ID NO:4) may also comprise a leader peptide
at the amino terminal portion of the protein, which may affect cellular trafficking and
hence, immunogenicity of the expressed protein within the host cell. As shown in
Figure 1A-B for the DNA vector V1Jns, a DNA vector which may be utilized to
practice the present invention may be modified by known recombinant DNA
methodology to contain a leader signal peptide of interest, such that downstream
cloning of the modified HIV-1 protein of interest results in a nucleotide sequence
which encodes a modified HIV-1 tPA/Pol protein. In the alternative, as noted above,
a nucleotide sequence which encodes a leader peptide may be inserted
into a DNA vector housing the open reading frame for the Pol protein of interest.
Regardless of the cloning strategy, the end result is a polynucleotide vaccine which
comprises vector components for effective gene expression in conjunction with
nucleotide sequences which encode a modified HIV-1 Pol protein of interest,
including but not limited to a HIV-1 Pol protein which contains a leader peptide. The
amino acid sequence of the human tPA leader utilized herein is as follows:
MDAMKRGLCCVLLLCGAVFVSPSEISS (SEQ ID NO:28). Therefore, another
aspect of the present invention is to generate HIV-1 Pol-based vaccine constructions
which comprise a eukaryotic trafficking signal peptide such as from tPA. To this end,
the present invention relates to a DNA molecule which encodes a codon optimized
wt-pol DNA construct wherein the protease (PR) activity is deleted and a human tPA
leader sequence is fused to the 5' end of the coding region. A DNA molecule which
encodes this protein is disclosed herein as SEQ ID NO:5, the open reading frame
disclosed herein as SEQ ID NO:6.

To this end, the present invention relates to a DNA molecule which encodes a
codon optimized wt-pol DNA construct wherein the protease (PR) activity is deleted
and a human tPA leader sequence is fused to the 5' end of the coding region (herein,
"tPA-wt-pol"). A DNA molecule which encodes this protein is disclosed herein as
SEQ ID NO:5, the open reading frame being contained from an initiating Met residue
at nucleotides 8-10 to a termination codon from nucleotides 2633-2635. SEQ ID
NO:5 is as follows:

GATCACCATG GATGCAATGA AGAGAGGGCT CTGCTGTGTG CTGCTGCTGT GTGGAGCAGT
CTTCGTTTCG CCCAGCGAGA TCTCCGCCCC CATCTCCCCC ATTGAGACTG TGCCTGTGAA
GCTGAAGCCT GGCATGGATG GCCCCAAGGT GAAGCAGTGG CCCCTGACTG AGGAGAAGAT
CAAGGCCCTG GTGGAAATCT GCACTGAGAT GGAGAAGGAG GGCAAAATCT CCAAGATTGG
CCCCGAGAAC CCCTACAACA CCCCTGTGTT TGCCATCAAG AAGAAGGACT CCACCAAGTG
GAGGAAGCTG GTGGACTTCA GGGAGCTGAA CAAGAGGACC CAGGACTTCT GGGAGGTGCA
GCTGGGCATC CCCCACCCCG CTGGCCTGAA GAAGAAGAAG TCTGTGACTG TGCTGGATGT
GGGGGATGCC TACTTCTCTG TGCCCCTGGA TGAGGACTTC AGGAAGTACA CTGCCTTCAC
CATCCCCTCC ATCAACAATG AGACCCCTGG CATCAGGTAC CAGTACAATG TGCTGCCCCA
GGGCTGGAAG GGCTCCCCTG CCATCTTCCA GTCCTCCATG ACCAAGATCC TGGAGCCCTT
CAGGAAGCAG AACCCTGACA TTGTGATCTA CCAGTACATG GATGACCTGT ATGTGGGCTC
TGACCTGGAG ATTGGGCAGC ACAGGACCAA GATTGAGGAG CTGAGGCAGC ACCTGCTGAG
GTGGGGCCTG ACCACCCCTG ACAAGAAGCA CCAGAAGGAG CCCCCCTTCC TGTGGATGGG
CTATGAGCTG CACCCCGACA AGTGGACTGT GCAGCCCATT GTGCTGCCTG AGAAGGACTC
CTGGACTGTG AATGACATCC AGAAGCTGGT GGGCAAGCTG AACTGGGCCT CCCAAATCTA
CCCTGGCATC AAGGTGAGGC AGCTGTGCAA GCTGCTGAGG GGCACCAAGG CCCTGACTGA
GGTGATCCCC CTGACTGAGG AGGCTGAGCT GGAGCTGGCT GAGAACAGGG AGATCCTGAA
GGAGCCTGTG CATGGGGTGT ACTATGACCC CTCCAAGGAC CTGATTGCTG AGATCCAGAA
GCAGGGCCAG GGCCAGTGGA CCTACCAAAT CTACCAGGAG CCCTTCAAGA ACCTGAAGAC
TGGCAAGTAT GCCAGGATGA GGGGGGCCCA CACCAATGAT GTGAAGCAGC TGACTGAGGC
TGTGCAGAAG ATCACCACTG AGTCCATTGT GATCTGGGGC AAGACCCCCA AGTTCAAGCT
GCCCATCCAG AAGGAGACCT GGGAGACCTG GTGGACTGAG TACTGGCAGG CCACCTGGAT
CCCTGAGTGG GAGTTTGTGA AGACCCCCCC CCTGGTGAAG CTGTGGTACC AGCTGGAGAA
GGAGCCCATT GTGGGGGCTG AGACCTTCTA TGTGGATGGG GCTGCCAACA GGGAGACCAA
GCTGGGCAAG GCTGGCTATG TGACCAACAG GGGCAGGCAG AAGGTGGTGA CCCTGACTGA
CACCACCAAC CAGAAGACTG AGCTCCAGGC CATCTACCTG GCCCTCCAGG ACTCTGGCCT
GGAGGTGAAC ATTGTGACTG ACTCCCAGTA TGCCCTGGGC ATCATCCAGG CCCAGCCTGA
TCAGTCTGAG TCTGAGCTGG TGAACCAGAT CATTGAGCAG CTGATCAAGA AGGAGAAGGT
GTACCTGGCC TGGGTGCCTG CCCACAAGGG CATTGGGGGC AATGAGCAGG TGGACAAGCT
GGTGTCTGCT GGCATCAGGA AGGTGCTGTT CCTGGATGGC ATTGACAAGG CCCAGGATGA
GCATGAGAAG TACCACTCCA ACTGGAGGGC TATGGCCTCT GACTTCAACC TGCCCCCTGT
GGTGGCTAAG GAGATTGTGG CCTCCTGTGA CAAGTGCCAG CTGAAGGGGG AGGCCATGCA
TGGGCAGGTG GACTGCTCCC CTGGCATCTG GCAGCTGGAC TGCACCCACC TGGAGGGCAA
GGTGATCCTG GTGGCTGTGC ATGTGGCCTC CGGCTACATT GAGGCTGAGG TGATCCCTGC
TGAGACAGGC CAGGAGACTG CCTACTTCCT GCTGAAGCTG GCTGGCAGGT GGCCTGTGAA
GACCATCCAC ACTGACAATG GCTCCAACTT CACTGGGGCC ACAGTGAGGG CTGCCTGCTG
GTGGGCTGGC ATCAAGCAGG AGTTTGGCAT CCCCTACAAC CCCCAGTCCC AGGGGGTGGT
GGAGTCCATG AACAAGGAGC TGAAGAAGAT CATTGGGCAG GTGAGGGACC AGGCTGAGCA
CCTGAAGACA GCTGTGCAGA TGGCTGTGTT CATCCACAAC TTCAAGAGGA AGGGGGGCAT
CGGGGGCTAC TCCGCTGGGG AGAGGATTGT GGACATCATT GCCACAGACA TCCAGACCAA
GGAGCTCCAG AAGCAGATCA CCAAGATCCA GAACTTCAGG GTGTACTACA GGGACTCCAG
GAACCCCCTG TGGAAGGGCC CTGCCAAGCT GCTGTGGAAG GGGGAGGGGG CTGTGGTGAT
CCAGGACAAC TCTGACATCA AGGTGGTGCC CAGGAGGAAG GCCAAGATCA TCAGGGACTA
TGGCAAGCAG ATGGCTGGGG ATGACTGTGT GGCCTCCAGG CAGGATGAGG ACTAAAGCCC
GGGCAGATCT (SEQ ID NO:5).

The open reading frame of the wild type tPA-pol construct disclosed as SEQ 
ID NO:5 contains 875 amino acids, disclosed herein as SEQ ID NO:6, as follows: 

Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ala Pro Ile Ser Pro Ile
Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met Asp Gly Pro Lys Val
Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys Ala Leu Val Glu Ile
Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser Lys Ile Gly Pro Glu
Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys Lys Lys Asp Ser Thr
Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu Asn Lys Arg Thr Gln
Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His Pro Ala Gly Leu Lys
Lys Lys Lys Ser Val Thr Val Leu Asp Val Gly Asp Ala Tyr Phe Ser
Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro
Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu
Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser Met Thr
Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro Asp Ile Val Ile Tyr
Gln Tyr Met Asp Asp Leu Tyr Val Gly Ser Asp Leu Glu Ile Gly Gln
His Arg Thr Lys Ile Glu Glu Leu Arg Gln His Leu Leu Arg Trp Gly
Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu Pro Pro Phe Leu Trp
Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro Ile Val
Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys Leu Val
Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys Val Arg
Gln Leu Cys Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Glu Val Ile
Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala Glu Asn Arg Glu Ile
Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp Pro Ser Lys Asp Leu
Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln Trp Thr Tyr Gln Ile
Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly Lys Tyr Ala Arg Met
Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu Thr Glu Ala Val Gln
Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly Lys Thr Pro Lys Phe
Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr Trp Trp Thr Glu Tyr
Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe Val Asn Thr Pro Pro
Leu Val Lys Leu Trp Tyr Gln Leu Glu Lys Glu Pro Ile Val Gly Ala
Glu Thr Phe Tyr Val Asp Gly Ala Ala Asn Arg Glu Thr Lys Leu Gly
Lys Ala Gly Tyr Val Thr Asn Arg Gly Arg Gln Lys Val Val Thr Leu
Thr Asp Thr Thr Asn Gln Lys Thr Glu Leu Gln Ala Ile Tyr Leu Ala
Leu Gln Asp Ser Gly Leu Glu Val Asn Ile Val Thr Asp Ser Gln Tyr
Ala Leu Gly Ile Ile Gln Ala Gln Pro Asp Gln Ser Glu Ser Glu Leu
Val Asn Gln Ile Ile Glu Gln Leu Ile Lys Lys Glu Lys Val Tyr Leu
Ala Trp Val Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln Val Asp
Lys Leu Val Ser Ala Gly Ile Arg Lys Val Leu Phe Leu Asp Gly Ile
Asp Lys Ala Gln Asp Glu His Glu Lys Tyr His Ser Asn Trp Arg Ala
Met Ala Ser Asp Phe Asn Leu Pro Pro Val Val Ala Lys Glu Ile Val
Ala Ser Cys Asp Lys Cys Gln Leu Lys Gly Glu Ala Met His Gly Gln
Val Asp Cys Ser Pro Gly Ile Trp Gln Leu Asp Cys Thr His Leu Glu
Gly Lys Val Ile Leu Val Ala Val His Val Ala Ser Gly Tyr Ile Glu
Ala Glu Val Ile Pro Ala Glu Thr Gly Gln Glu Thr Ala Tyr Phe Leu
Leu Lys Leu Ala Gly Arg Trp Pro Val Lys Thr Ile His Thr Asp Asn
Gly Ser Asn Phe Thr Gly Ala Thr Val Arg Ala Ala Cys Trp Trp Ala
Gly Ile Lys Gln Glu Phe Gly Ile Pro Tyr Asn Pro Gln Ser Gln Gly
Val Val Glu Ser Met Asn Lys Glu Leu Lys Lys Ile Ile Gly Gln Val
Arg Asp Gln Ala Glu His Leu Lys Thr Ala Val Gln Met Ala Val Phe
Ile His Asn Phe Lys Arg Lys Gly Gly Ile Gly Gly Tyr Ser Ala Gly
Glu Arg Ile Val Asp Ile Ile Ala Thr Asp Ile Gln Thr Lys Glu Leu
Gln Lys Gln Ile Thr Lys Ile Gln Asn Phe Arg Val Tyr Tyr Arg Asp
Ser Arg Asn Pro Leu Trp Lys Gly Pro Ala Lys Leu Leu Trp Lys Gly
Glu Gly Ala Val Val Ile Gln Asp Asn Ser Asp Ile Lys Val Val Pro
Arg Arg Lys Ala Lys Ile Ile Arg Asp Tyr Gly Lys Gln Met Ala Gly
Asp Asp Cys Val Ala Ser Arg Gln Asp Glu Asp (SEQ ID NO:6).
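
The relationship between the leader peptide, SEQ ID NO:2 and SEQ ID NO:6 can be sketched as follows. This is an illustration only, assuming the junction that can be read from SEQ ID NO:6, in which the 26 leader residues ending in Ser-Glu-Ile-Ser are followed directly by the Ala residue at position 2 of SEQ ID NO:2, so that the initiating Met of Pol is not repeated. The names TPA_LEADER and fuse_leader are hypothetical.

```python
# First 26 residues of SEQ ID NO:6 in one-letter code (the tPA-derived leader).
TPA_LEADER = "MDAMKRGLCCVLLLCGAVFVSPSEIS"


def fuse_leader(leader, protein):
    """Fuse a leader upstream of a protein, dropping the protein's initiating Met."""
    if not protein.startswith("M"):
        raise ValueError("protein is expected to start with an initiating Met")
    return leader + protein[1:]


# First 17 residues of SEQ ID NO:2 in one-letter code.
pol_start = "MAPISPIETVPVKLKPG"
print(fuse_leader(TPA_LEADER, pol_start))

# Length check: 26 leader residues plus 850 Pol residues minus the dropped Met.
print(len(TPA_LEADER) + 850 - 1)  # 875
```

The computed length of 875 residues agrees with the length stated for SEQ ID NO:6, and with the open reading frame coordinates given for SEQ ID NO:5, since (2633 - 8) / 3 = 875.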





The present invention also relates to a codon optimized HIV-1 Pol mutant such
as IA-Pol (SEQ ID NO:4) which comprises a leader peptide at the amino terminal
portion of the protein, which may affect cellular trafficking and hence,
immunogenicity of the expressed protein within the host cell. Any such HIV-1 DNA
pol mutant disclosed in the above paragraphs is suitable for fusion downstream of a
leader peptide, such as a leader peptide including but not limited to the human tPA
leader sequence. Therefore, any such leader peptide-based HIV-1 pol mutant
construct may include but is not limited to a mutated DNA molecule which effectively
alters the catalytic activity of the RT, RNase H and/or IN region of the expressed protein,
resulting in at least substantially decreased enzymatic activity of one or more of the RT,
RNase H and/or IN functions of HIV-1 Pol. In a preferred embodiment of this portion
of the invention, a leader peptide/HIV-1 DNA pol construct contains a mutation or
mutations within the Pol coding region which effectively abolishes RT, RNase H and
IN activity. An especially preferable HIV-1 DNA pol construct is a DNA molecule
which contains at least one point mutation which alters the active site and catalytic
activity within the RT, RNase H and IN domains of Pol, such that each activity is at
least substantially abolished, and preferably totally abolished. Such a HIV-1 Pol
mutant will most likely comprise at least one point mutation in or around each
catalytic domain responsible for RT, RNase H and IN activity, respectively. An
especially preferred embodiment of this portion of the invention relates to a human
tPA leader fused to the IA-Pol protein comprising the nine mutations shown in Table
1. The DNA molecule is disclosed herein as SEQ ID NO:7 and the expressed tPA-IA
Pol protein comprises a fusion junction as shown in Figure 3. The complete amino
acid sequence of the expressed protein is set forth in SEQ ID NO:8. To this end, SEQ
ID NO:7 discloses the nucleotide sequence which codes for a human tPA leader fused
to the IA Pol protein comprising the nine mutations shown in Table 1 (herein,
"tPA-opt-IApol"). The open reading frame begins with the initiating Met (nucleotides 8-10)
and terminates with a "TAA" codon at nucleotides 2633-2635. The nucleotide
sequence encoding tPA-IAPol is also disclosed as follows:



GATCACCATG GATGCAATGA AGAGAGGGCT CTGCTGTGTG CTGCTGCTGT GTGGAGCAGT
CTTCGTTTCG CCCAGCGAGA TCTCCGCCCC CATCTCCCCC ATTGAGACTG TGCCTGTGAA
GCTGAAGCCT GGCATGGATG GCCCCAAGGT GAAGCAGTGG CCCCTGACTG AGGAGAAGAT
CAAGGCCCTG GTGGAAATCT GCACTGAGAT GGAGAAGGAG GGCAAAATCT CCAAGATTGG
CCCCGAGAAC CCCTACAACA CCCCTGTGTT TGCCATCAAG AAGAAGGACT CCACCAAGTG
GAGGAAGCTG GTGGACTTCA GGGAGCTGAA CAAGAGGACC CAGGACTTCT GGGAGGTGCA
GCTGGGCATC CCCCACCCCG CTGGCCTGAA GAAGAAGAAG TCTGTGACTG TGCTGGCTGT
GGGGGATGCC TACTTCTCTG TGCCCCTGGA TGAGGACTTC AGGAAGTACA CTGCCTTCAC
CATCCCCTCC ATCAACAATG AGACCCCTGG CATCAGGTAC CAGTACAATG TGCTGCCCCA
GGGCTGGAAG GGCTCCCCTG CCATCTTCCA GTCCTCCATG ACCAAGATCC TGGAGCCCTT 
CAGGAAGCAG AACCCTGACA TTGTGATCTA CCAGTACATG GCTGCCCTGT ATGTGGGCTC 
TGACCTGGAG ATTGGGCAGC ACAGGACCAA GATTGAGGAG CTGAGGCAGC ACCTGCTGAG 
GTGGGGCCTG ACCACCCCTG ACAAGAAGCA CCAGAAGGAG CCCCCCTTCC TGTGGATGGG 
CTATGAGCTG CACCCCGACA AGTGGACTGT GCAGCCCATT GTGCTGCCTG AGAAGGACTC 
CTGGACTGTG AATGACATCC AGAAGCTGGT GGGCAAGCTG AACTGGGCCT CCCAAATCTA 
CCCTGGCATC AAGGTGAGGC AGCTGTGCAA GCTGCTGAGG GGCACCAAGG CCCTGACTGA 
GGTGATCCCC CTGACTGAGG AGGCTGAGCT GGAGCTGGCT GAGAACAGGG AGATCCTGAA 
GGAGCCTGTG CATGGGGTGT ACTATGACCC CTCCAAGGAC CTGATTGCTG AGATCCAGAA
GCAGGGCCAG GGCCAGTGGA CCTACCAAAT CTACCAGGAG CCCTTCAAGA ACCTGAAGAC 
TGGCAAGTAT GCCAGGATGA GGGGGGCCCA CACCAATGAT GTGAAGCAGC TGACTGAGGC 
TGTGCAGAAG ATCACCACTG AGTCCATTGT GATCTGGGGC AAGACCCCCA AGTTCAAGCT 
GCCCATCCAG AAGGAGACCT GGGAGACCTG GTGGACTGAG TACTGGCAGG CCACCTGGAT 
CCCTGAGTGG GAGTTTGTGA ACACCCCCCG CCTGGTGAAG CTGTGGTACC AGCTGGAGAA
GGAGCCCATT GTGGGGGCTG AGACCTTCTA TGTGGCTGGG GCTGCCAACA GGGAGACCAA 
GCTGGGCAAG GCTGGCTATG TGACCAACAG GGGCAGGCAG AAGGTGGTGA CCCTGACTGA 
CACCACCAAC CAGAAGACTG CCCTCCAGGC CATCTACCTG GCCCTCCAGG ACTCTGGCCT 
GGAGGTGAAC ATTGTGACTG CCTCCCAGTA TGCCCTGGGC ATCATCCAGG CCCAGCCTGA 
TCAGTCTGAG TCTGAGCTGG TGAACCAGAT CATTGAGCAG CTGATCAAGA AGGAGAAGGT 
GTACCTGGCC TGGGTGCCTG CCCACAAGGG CATTGGGGGC AATGAGCAGG TGGACAAGCT 
GGTGTCTGCT GGCATCAGGA AGGTGCTGTT CCTGGATGGC ATTGACAAGG CCCAGGATGA
GCATGAGAAG TACCACTCCA ACTGGAGGGC TATGGCCTCT GACTTCAACC TGCCCCCTGT 
GGTGGCTAAG GAGATTGTGG CCTCCTGTGA CAAGTGCCAG CTGAAGGGGG AGGCCATGCA 
TGGGCAGGTG GACTGCTCCC CTGGCATCTG GCAGCTGGCC TGCACCCACC TGGAGGGCAA 
GGTGATCCTG GTGGCTGTGC ATGTGGCCTC CGGCTACATT GAGGCTGAGG TGATCCCTGC 
TGAGACAGGC CAGGAGACTG CCTACTTCCT GCTGAAGCTG GCTGGCAGGT GGCCTGTGAA 
GACCATCCAC ACTGCCAATG GCTCCAACTT CACTGGGGCC ACAGTGAGGG CTGCCTGCTG 
GTGGGCTGGC ATCAAGCAGG AGTTTGGCAT CCCCTACAAC CCCCAGTCCC AGGGGGTGGT 
GGCCTCCATG AACAAGGAGC TGAAGAAGAT CATTGGGCAG GTGAGGGACC AGGCTGAGCA 
CCTGAAGACA GCTGTGCAGA TGGCTGTGTT CATCCACAAC TTCAAGAGGA AGGGGGGCAT 
CGGGGGCTAC TCCGCTGGGG AGAGGATTGT GGACATCATT GCCACAGACA TCCAGACCAA 
GGAGCTCCAG AAGCAGATCA CCAAGATCCA GAACTTCAGG GTGTACTACA GGGACTCCAG 
GAACCCCCTG TGGAAGGGCC CTGCCAAGCT GCTGTGGAAG GGGGAGGGGG CTGTGGTGAT 
CCAGGACAAC TCTGACATCA AGGTGGTGCC CAGGAGGAAG GCCAAGATCA TCAGGGACTA
TGGCAAGCAG ATGGCTGGGG ATGACTGTGT GGCCTCCAGG CAGGATGAGG ACTAAAGCCC
GGGCAGATCT (SEQ ID NO: 7).

The open reading frame of the tPA-IA-pol construct disclosed as SEQ ID
NO:7 encodes 875 amino acids, disclosed herein as tPA-IA-Pol and SEQ ID NO:8, as
follows:

Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ala Pro Ile Ser Pro Ile
Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met Asp Gly Pro Lys Val
Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys Ala Leu Val Glu Ile
Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser Lys Ile Gly Pro Glu
Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys Lys Lys Asp Ser Thr
Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu Asn Lys Arg Thr Gln
Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His Pro Ala Gly Leu Lys
Lys Lys Lys Ser Val Thr Val Leu Ala Val Gly Asp Ala Tyr Phe Ser
Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro
Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu
Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser Met Thr
Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro Asp Ile Val Ile Tyr
Gln Tyr Met Ala Ala Leu Tyr Val Gly Ser Asp Leu Glu Ile Gly Gln
His Arg Thr Lys Ile Glu Glu Leu Arg Gln His Leu Leu Arg Trp Gly
Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu Pro Pro Phe Leu Trp
Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro Ile Val
Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys Leu Val
Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys Val Arg
Gln Leu Cys Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Glu Val Ile
Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala Glu Asn Arg Glu Ile
Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp Pro Ser Lys Asp Leu
Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln Trp Thr Tyr Gln Ile
Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly Lys Tyr Ala Arg Met
Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu Thr Glu Ala Val Gln
Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly Lys Thr Pro Lys Phe
Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr Trp Trp Thr Glu Tyr
Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe Val Asn Thr Pro Pro
Leu Val Lys Leu Trp Tyr Gln Leu Glu Lys Glu Pro Ile Val Gly Ala
Glu Thr Phe Tyr Val Ala Gly Ala Ala Asn Arg Glu Thr Lys Leu Gly
Lys Ala Gly Tyr Val Thr Asn Arg Gly Arg Gln Lys Val Val Thr Leu
Thr Asp Thr Thr Asn Gln Lys Thr Ala Leu Gln Ala Ile Tyr Leu Ala
Leu Gln Asp Ser Gly Leu Glu Val Asn Ile Val Thr Ala Ser Gln Tyr
Ala Leu Gly Ile Ile Gln Ala Gln Pro Asp Gln Ser Glu Ser Glu Leu
Val Asn Gln Ile Ile Glu Gln Leu Ile Lys Lys Glu Lys Val Tyr Leu
Ala Trp Val Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln Val Asp
Lys Leu Val Ser Ala Gly Ile Arg Lys Val Leu Phe Leu Asp Gly Ile
Asp Lys Ala Gln Asp Glu His Glu Lys Tyr His Ser Asn Trp Arg Ala
Met Ala Ser Asp Phe Asn Leu Pro Pro Val Val Ala Lys Glu Ile Val
Ala Ser Cys Asp Lys Cys Gln Leu Lys Gly Glu Ala Met His Gly Gln
Val Asp Cys Ser Pro Gly Ile Trp Gln Leu Ala Cys Thr His Leu Glu
Gly Lys Val Ile Leu Val Ala Val His Val Ala Ser Gly Tyr Ile Glu
Ala Glu Val Ile Pro Ala Glu Thr Gly Gln Glu Thr Ala Tyr Phe Leu
Leu Lys Leu Ala Gly Arg Trp Pro Val Lys Thr Ile His Thr Ala Asn
Gly Ser Asn Phe Thr Gly Ala Thr Val Arg Ala Ala Cys Trp Trp Ala
Gly Ile Lys Gln Glu Phe Gly Ile Pro Tyr Asn Pro Gln Ser Gln Gly
Val Val Ala Ser Met Asn Lys Glu Leu Lys Lys Ile Ile Gly Gln Val
Arg Asp Gln Ala Glu His Leu Lys Thr Ala Val Gln Met Ala Val Phe
Ile His Asn Phe Lys Arg Lys Gly Gly Ile Gly Gly Tyr Ser Ala Gly
Glu Arg Ile Val Asp Ile Ile Ala Thr Asp Ile Gln Thr Lys Glu Leu
Gln Lys Gln Ile Thr Lys Ile Gln Asn Phe Arg Val Tyr Tyr Arg Asp
Ser Arg Asn Pro Leu Trp Lys Gly Pro Ala Lys Leu Leu Trp Lys Gly
Glu Gly Ala Val Val Ile Gln Asp Asn Ser Asp Ile Lys Val Val Pro
Arg Arg Lys Ala Lys Ile Ile Arg Asp Tyr Gly Lys Gln Met Ala Gly
Asp Asp Cys Val Ala Ser Arg Gln Asp Glu Asp (SEQ ID NO: 8).
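
The stated open reading frame coordinates and residue count can be cross-checked with a short script. The following is a minimal sketch in Python and is not part of the original disclosure; the helper names `orf_residue_count` and `translate` are hypothetical. It assumes the standard genetic code, the coordinates stated for SEQ ID NO:7 (initiating Met at nucleotides 8-10, "TAA" at nucleotides 2633-2635), and uses only the first 60 nucleotides of SEQ ID NO:7.

```python
from itertools import product

# Standard genetic code with codons enumerated in TCAG order
# (the ordering used for NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def orf_residue_count(start: int, stop_end: int) -> int:
    """Residues encoded by an ORF, given the 1-based position of the first
    base of the start codon and of the last base of the stop codon."""
    length = stop_end - start + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return length // 3 - 1  # the stop codon encodes no residue


def translate(dna: str, start: int) -> str:
    """Translate from 1-based position `start` until a stop codon or the
    end of the sequence; spaces between 10-mer blocks are ignored."""
    seq = dna.replace(" ", "").upper()[start - 1:]
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row (60 nucleotides) of SEQ ID NO:7 as listed above.
FIRST_60_NT = "GATCACCATG GATGCAATGA AGAGAGGGCT CTGCTGTGTG CTGCTGCTGT GTGGAGCAGT"

print(orf_residue_count(8, 2635))   # 875
print(translate(FIRST_60_NT, 8))    # MDAMKRGLCCVLLLCGA
```

The count agrees with the 875 amino acids stated for SEQ ID NO:8, and the translated prefix agrees with the first 17 residues of SEQ ID NO:8 (Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly Ala).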

The present invention also relates to a substantially purified protein expressed
from the DNA polynucleotide vaccines of the present invention, especially the
purified proteins set forth herein as SEQ ID NOs: 2, 4, 6, and 8. These purified
proteins may be useful as protein-based HIV vaccines.

The DNA backbone of the DNA vaccines of the present invention is
preferably a DNA plasmid expression vector. DNA plasmid expression vectors are
well known in the art and the present DNA vector vaccines may be comprised of any
such expression backbone which contains at least a promoter for RNA polymerase
transcription, and a transcriptional terminator 3' to the HIV pol coding sequence. In
one preferred embodiment, the promoter is the Rous sarcoma virus (RSV) long
terminal repeat (LTR), which is a strong transcriptional promoter. A more preferred
promoter is the cytomegalovirus promoter with the intron A sequence (CMV-intA).
A preferred transcriptional terminator is the bovine growth hormone terminator. In
addition, to assist in large scale preparation of an HIV pol DNA vector vaccine, an
antibiotic resistance marker is also preferably included in the expression vector.
Ampicillin resistance genes, neomycin resistance genes or any other pharmaceutically
acceptable antibiotic resistance marker may be used. In a preferred embodiment of
this invention, the antibiotic resistance gene encodes a gene product for neomycin
resistance. Further, to aid in the high level production of the pharmaceutical by
fermentation in prokaryotic organisms, it is advantageous for the vector to contain an
origin of replication and be of high copy number. Any of a number of commercially
available prokaryotic cloning vectors provide these benefits. In a preferred
embodiment of this invention, these functionalities are provided by the commercially
available vectors known as pUC. It is desirable to remove non-essential DNA
sequences. Thus, the lacZ and lacI coding sequences of pUC are removed in one
embodiment of the invention.

DNA expression vectors which exemplify but in no way limit the present
invention are disclosed in PCT International Application No. PCT/US94/02751,
International Publication No. WO 94/21797, hereby incorporated by reference. A
first DNA expression vector is the expression vector pnRSV, wherein the Rous
sarcoma virus (RSV) long terminal repeat (LTR) is used as the promoter. A second
embodiment relates to plasmid V1, a mutated pBR322 vector into which the CMV
promoter and the BGH transcriptional terminator are cloned. Another embodiment
regarding DNA vector backbones relates to plasmid V1J. Plasmid V1J is derived
from plasmid V1 and removes promoter and transcription termination elements in
order to place them within a more defined context, create a more compact vector, and
to improve plasmid purification yields. Therefore, V1J also contains the CMVintA
promoter and (BGH) transcription termination elements which control the expression
of the HIV pol-based genes disclosed herein. The backbone of V1J is provided by
pUC18. It is known to produce high yields of plasmid, is well-characterized by
sequence and function, and is of minimum size. The entire lac operon was removed
and the remaining plasmid was purified from an agarose electrophoresis gel,
blunt-ended with the T4 DNA polymerase, treated with calf intestinal alkaline
phosphatase, and ligated to the CMVintA/BGH element. In a preferred DNA
expression vector, the ampicillin resistance gene is removed from V1J and replaced
with a neomycin resistance gene, to generate V1Jneo. An especially preferred DNA
expression vector is V1Jns, which is the same as V1J except that a unique SfiI
restriction site has been engineered into the single KpnI site at position 2114 of
V1Jneo. The incidence of SfiI sites in human genomic DNA is very low (approximately
1 site per 100,000 bases). Thus, this vector allows careful monitoring for expression
vector integration into host DNA, simply by SfiI digestion of extracted genomic
DNA. Yet another preferred DNA expression vector used as the backbone to the
HIV-1 pol-based DNA vaccines of the present invention is V1R. In this vector, as
much non-essential DNA as possible is "trimmed" from the vector to produce a highly
compact vector. This vector is a derivative of V1Jns. This vector allows larger
inserts to be used, with less concern that undesirable sequences are encoded, and
optimizes uptake by cells when the construct encoding specific influenza virus genes
is introduced into surrounding tissue. The specific DNA vectors of the present
invention include but are not limited to V1, V1J (SEQ ID NO:14), V1Jneo (SEQ ID
NO:15), V1Jns (Figure 1A, SEQ ID NO:16), V1R (SEQ ID NO:26), and any of the
aforementioned vectors wherein a nucleotide sequence encoding a leader peptide,
preferably the human tPA leader, is fused directly downstream of the CMV-intA
promoter, including but not limited to V1Jns-tpa, as shown in Figure 1B and SEQ ID
NO:28.
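
The SfiI-based monitoring described above relies on how rarely the SfiI recognition site occurs. The following is a minimal sketch in Python, not part of the original disclosure, of how such sites could be located in a sequence. It assumes the commonly cited SfiI recognition sequence GGCCNNNNNGGCC; the function names and the toy sequence are hypothetical, and the random-sequence estimate assumes equal base frequencies, which genomic DNA does not strictly satisfy.

```python
import re

# SfiI recognition site: GGCC, any five bases, GGCC (13 bp, 8 of them specified).
# A lookahead is used so that overlapping sites are also reported.
SFII_SITE = re.compile(r"(?=GGCC[ACGT]{5}GGCC)")


def find_sfii_sites(dna: str) -> list[int]:
    """Return 1-based start positions of SfiI recognition sites in `dna`."""
    seq = "".join(dna.split()).upper()
    return [m.start() + 1 for m in SFII_SITE.finditer(seq)]


def expected_spacing_random(specified_bases: int = 8) -> int:
    """Average spacing between sites in random DNA with equal base
    frequencies (an idealisation used only for an order-of-magnitude check)."""
    return 4 ** specified_bases


# Toy sequence (hypothetical, for illustration only).
toy = "AAAA GGCCTTTTTGGCC AAAA"
print(find_sfii_sites(toy))        # [5]
print(expected_spacing_random())   # 65536
```

Under the equal-frequency idealisation a site is expected about once per 65,536 bases, which is of the same order as the approximately 1 site per 100,000 bases stated above for human genomic DNA.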

The present invention especially relates to a DNA vaccine and a
pharmaceutically active vaccine composition which contains this DNA vaccine, and
their use as a prophylactic and/or therapeutic vaccine for host immunization, preferably
human host immunization, against an HIV infection or to combat an existing HIV
condition. These DNA vaccines are represented by codon optimized DNA molecules
encoding HIV-1 Pol or biologically active Pol modifications or Pol-containing fusion
proteins which are ligated within an appropriate DNA plasmid vector, with or without
a nucleotide sequence encoding a functional leader peptide. DNA vaccines of the
present invention may comprise codon optimized DNA molecules encoding HIV-1
Pol or biologically active Pol modifications or Pol-containing fusion proteins ligated
in DNA vectors V1, V1J (SEQ ID NO:14), V1Jneo (SEQ ID NO:15), V1Jns (Figure
1A, SEQ ID NO:16), V1R (SEQ ID NO:26), or any of the aforementioned vectors
wherein a nucleotide sequence encoding a leader peptide, preferably the human tPA
leader, is fused directly downstream of the CMV-intA promoter, including but not
limited to V1Jns-tpa, as shown in Figure 1B and SEQ ID NO:28. To this end,
polynucleotide vaccine constructions include V1Jns-wtpol and V1R-wtpol
(comprising the DNA molecule encoding WT Pol, as set forth in SEQ ID NO:2),
V1Jns-tPA-WTPol (comprising the DNA molecule encoding tPA Pol, as set forth in
SEQ ID NO:6), V1Jns-IAPol (comprising the DNA molecule encoding IA Pol, as set
forth in SEQ ID NO:4), and V1Jns-tPA-IAPol (comprising the DNA molecule
encoding tPA-IA Pol, as set forth in SEQ ID NO:8). Polynucleotide vaccine
constructions V1R-wtpol, V1Jns-IAPol, and V1Jns-tPA-IAPol are exemplified in
Example Sections 3-5.

It will be evident upon review of the teaching within this specification that
numerous vector/Pol antigen constructs may be generated. While the exemplified
constructs are preferred, any number of vector/Pol antigen combinations are within
the scope of the present invention, especially wild type or modified/inactivated Pol
proteins which comprise at least one, preferably 5 or more and especially all nine
mutations as shown in Table 1, with or without the inclusion of a leader sequence
such as human tPA.

The DNA vector vaccines of the present invention may be formulated in any
pharmaceutically effective formulation for host administration. Any such formulation
may be, for example, a saline solution such as phosphate buffered saline (PBS).
It will be useful to utilize pharmaceutically acceptable formulations which also
provide long-term stability of the DNA vector vaccines of the present invention.
During storage as a pharmaceutical entity, DNA plasmid vaccines undergo a
physicochemical change in which the supercoiled plasmid converts to the open circular
and linear form. A variety of storage conditions (low pH, high temperature, low ionic
strength) can accelerate this process. Therefore, the removal and/or chelation of trace
metal ions (with succinic or malic acid, or with chelators containing multiple
phosphate ligands) from the DNA plasmid solution, from the formulation buffers or
from the vials and closures stabilizes the DNA plasmid against this degradation
pathway during storage. In addition, inclusion of non-reducing free radical
scavengers, such as ethanol or glycerol, is useful to prevent damage of the DNA
plasmid from free radical production that may still occur, even in apparently
demetalated solutions. Furthermore, the buffer type, pH, salt concentration, light
exposure, as well as the type of sterilization process used to prepare the vials, may be
controlled in the formulation to optimize the stability of the DNA vaccine. Therefore,
the formulation that will provide the highest stability of the DNA vaccine will be one
that includes a demetalated solution containing a buffer (phosphate or bicarbonate)
with a pH in the range of 7-8, a salt (NaCl, KCl or LiCl) in the range of 100-200 mM,
a metal ion chelator (e.g., EDTA, diethylenetriaminepenta-acetic acid (DTPA),
malate, inositol hexaphosphate, tripolyphosphate or polyphosphoric acid), a
non-reducing free radical scavenger (e.g., ethanol, glycerol, methionine or dimethyl
sulfoxide) and the highest appropriate DNA concentration in a sterile glass vial,
packaged to protect the highly purified, nuclease free DNA from light. A particularly
preferred formulation which will enhance long term stability of the DNA vector
vaccines of the present invention would comprise a Tris-HCl buffer at a pH from
about 8.0 to about 9.0; ethanol or glycerol at about 3% w/v; EDTA or DTPA in a
concentration range up to about 5 mM; and NaCl at a concentration from about 50
mM to about 500 mM. The use of such stabilized DNA vector vaccines and various
alternatives to this preferred formulation range is described in detail in PCT
International Application No. PCT/US97/06655 and PCT International Publication
No. WO 97/40839, both of which are hereby incorporated by reference.
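
The particularly preferred formulation ranges above can be expressed as a simple range check. The following is a minimal sketch in Python, not part of the original disclosure; the `Formulation` record, the `check_preferred` function and the tolerance used to interpret "about 3% w/v" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Formulation:
    """Hypothetical record describing one candidate formulation."""
    buffer: str               # e.g. "Tris-HCl"
    ph: float
    scavenger: str            # "ethanol" or "glycerol"
    scavenger_pct_wv: float   # percent w/v
    chelator: str             # "EDTA" or "DTPA"
    chelator_mm: float        # millimolar
    nacl_mm: float            # millimolar


def check_preferred(f: Formulation, pct_tolerance: float = 0.5) -> list[str]:
    """List the ways `f` departs from the particularly preferred ranges."""
    problems = []
    if f.buffer != "Tris-HCl":
        problems.append("buffer is not Tris-HCl")
    if not 8.0 <= f.ph <= 9.0:
        problems.append("pH outside about 8.0 to about 9.0")
    if f.scavenger not in ("ethanol", "glycerol"):
        problems.append("scavenger is not ethanol or glycerol")
    # "About 3% w/v" is read as 3 +/- pct_tolerance; the tolerance is an assumption.
    if abs(f.scavenger_pct_wv - 3.0) > pct_tolerance:
        problems.append("scavenger not at about 3% w/v")
    if f.chelator not in ("EDTA", "DTPA"):
        problems.append("chelator is not EDTA or DTPA")
    if not 0.0 < f.chelator_mm <= 5.0:
        problems.append("chelator outside the range up to about 5 mM")
    if not 50.0 <= f.nacl_mm <= 500.0:
        problems.append("NaCl outside about 50 mM to about 500 mM")
    return problems


in_range = Formulation("Tris-HCl", 8.5, "ethanol", 3.0, "EDTA", 1.0, 150.0)
out_of_range = Formulation("Tris-HCl", 7.2, "ethanol", 3.0, "EDTA", 1.0, 20.0)
print(check_preferred(in_range))       # []
print(check_preferred(out_of_range))
```

The second call reports the pH and NaCl departures; the hard range boundaries are a simplification, since the text qualifies each limit with "about".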

The DNA vector vaccines of the present invention may also be formulated
with an adjuvant or adjuvants which may increase immunogenicity of the DNA
polynucleotide vaccines of the present invention. A number of these adjuvants are
known in the art and are available for use in a DNA vaccine, including but not
limited to particle bombardment using DNA-coated gold beads, co-administration
of DNA vaccines with plasmid DNA expressing cytokines, chemokines, or
costimulatory molecules, formulation of DNA with cationic lipids or with
experimental adjuvants such as saponin, monophosphoryl lipid A or other
compounds which increase immunogenicity of the DNA vaccine. Another
adjuvant for use in the DNA vector vaccines of the present invention is one or
more forms of an aluminum phosphate-based adjuvant wherein the aluminum
phosphate-based adjuvant possesses a molar PO4/Al ratio of approximately 0.9.
An additional mineral-based adjuvant may be generated from one or more forms
of a calcium phosphate. These mineral-based adjuvants are useful in increasing
cellular and humoral responses to DNA vaccination. These mineral-based
compounds for use as DNA vaccine adjuvants are disclosed in PCT International
Application No. PCT/US98/02414, PCT International Publication No.
WO 98/35562, which is hereby incorporated by reference. Another preferred
adjuvant is a non-ionic block copolymer which shows adjuvant activity with DNA
vaccines. The basic structure comprises blocks of polyoxyethylene (POE) and
polyoxypropylene (POP) such as a POE-POP-POE block copolymer. Newman et
al. (1998, Critical Reviews in Therapeutic Drug Carrier Systems 15(2): 89-142)
review a class of non-ionic block copolymers which show adjuvant activity.
Newman et al., id., disclose
that certain POE-POP-POE block copolymers may be useful as adjuvants to an
influenza protein-based vaccine, namely higher molecular weight POE-POP-POE
block copolymers containing a central POP block having a molecular weight of
over about 9000 daltons to about 20,000 daltons and flanking POE blocks which
comprise up to about 20% of the total molecular weight of the copolymer (see also
U.S. Reissue Patent No. 36,665, U.S. Patent No. 5,567,859, U.S. Patent No.
5,691,387, U.S. Patent No. 5,696,298 and U.S. Patent No. 5,990,241, all issued to
Emanuele, et al., regarding these POE-POP-POE block copolymers).
WO 96/04932 further discloses higher molecular weight POE/POP block
copolymers which have surfactant characteristics and show biological efficacy as
vaccine adjuvants. The above cited references within this paragraph are hereby
incorporated by reference in their entirety. It is therefore within the purview of
the skilled artisan to utilize available adjuvants which may increase the immune
response of the polynucleotide vaccines of the present invention in comparison to
administration of a non-adjuvanted polynucleotide vaccine.

The DNA vector vaccines of the present invention are administered to the host
by any means known in the art, such as enteral and parenteral routes. These routes of
delivery include but are not limited to intramuscular injection, intraperitoneal injection,
intravenous injection, inhalation or intranasal delivery, oral delivery, sublingual
administration, subcutaneous administration, transdermal administration,
transcutaneous administration, percutaneous administration or any form of particle
bombardment, such as with a biolistic device such as a "gene gun", or by any available
needle-free injection device. The preferred methods of delivery of the HIV-1 Pol-based
DNA vaccines disclosed herein are intramuscular injection, subcutaneous
administration and needle-free injection. An especially preferred method is
intramuscular delivery.

The amount of expressible DNA to be introduced to a vaccine recipient will
depend on the strength of the transcriptional and translational promoters used in the
DNA construct, and on the immunogenicity of the expressed gene product. In
general, an immunologically or prophylactically effective dose of about 1 µg to
greater than about 20 mg, and preferably from about 1 mg to about 5 mg, is
administered directly into muscle tissue. As noted above, subcutaneous injection,
intradermal introduction, impression through the skin, and other modes of
administration such as intraperitoneal, intravenous, inhalation and oral delivery are
also contemplated. It is also contemplated that booster vaccinations are to be
provided in a fashion which optimizes the overall immune response to the Pol-based
DNA vector vaccines of the present invention.

The aforementioned polynucleotides, when directly introduced into a
vertebrate in vivo, express the respective HIV-1 Pol protein within the animal and in
turn induce a cellular immune response within the host to the expressed Pol antigen.
To this end, the present invention also relates to methods of using the HIV-1 Pol-based
polynucleotide vaccines of the present invention to provide effective
immunoprophylaxis, to prevent establishment of an HIV-1 infection following
exposure to this virus, or as a post-HIV infection therapeutic vaccine to mitigate the
acute HIV-1 infection so as to result in the establishment of a lower virus load with
beneficial long term consequences. As noted above, the present invention
contemplates a method of administration or use of the DNA pol-based vaccines of the
present invention using any of the known routes of introducing polynucleotides
into living tissue to induce expression of proteins.

Therefore, the present invention provides for methods of using a DNA pol-based
vaccine utilizing the various parameters disclosed herein as well as any
additional parameters known in the art, which, upon introduction into mammalian
tissue, induces intracellular expression of these DNA pol-based vaccines. This
intracellular expression of the Pol-based immunogen induces a cellular immune
response which provides a substantial level of protection against an existing HIV-1
infection or provides a substantial level of protection against a future infection in a
presently uninfected host.

The following examples are provided to illustrate the present invention
without, however, limiting the same hereto.

EXAMPLE 1
Vaccine Vectors 

V1 - Vaccine vector V1 was constructed from pCMVIE-AKI-DHFR (Whang
et al., 1987, J. Virol. 61: 1796). The AKI and DHFR genes were removed by cutting
the vector with EcoRI and self-ligating. This vector does not contain intron A in the
CMV promoter, so it was added as a PCR fragment that had a deleted internal SacI
site (at 1855 as numbered in Chapman et al., 1991, Nuc. Acids Res. 19: 3979). The
template used for the PCR reactions was pCMVintA-Lux, made by ligating the
HindIII and NheI fragment from pCMV6a120 (see Chapman et al., ibid), which
includes hCMV-IE1 enhancer/promoter and intron A, into the HindIII and XbaI sites
of pBL3 to generate pCMVIntBL. The 1881 base pair luciferase gene fragment
(HindIII-SmaI Klenow filled-in) from RSV-Lux (de Wet et al., 1987, Mol. Cell. Biol.
7: 725) was ligated into the SalI site of pCMVIntBL, which was Klenow filled-in and
phosphatase treated. The primers that spanned intron A are: 5' primer:
5'-CTATATAAGCAGAGCTCGTTTAG-3' (SEQ ID NO:10); 3' primer:
5'-GTAGCAAAGATCTAAGGACGGTGACTGCAG-3' (SEQ ID NO:11). The primers used to
remove the SacI site are: sense primer,
5'-GTATGTGTCTGAAAATGAGCGTGGAGATTGGGCTCGCAC-3' (SEQ ID NO:12) and the antisense primer,
5'-GTGCGAGCCCAATCTCCACGCTCATTTTCAGACACATAC-3' (SEQ ID NO:13).
The PCR fragment was cut with Sac I and Bgl II and inserted into the vector
which had been cut with the same enzymes.
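
The two primers used to remove the SacI site would be expected to be exact reverse complements of one another and to lack the SacI recognition sequence. The following is a minimal sketch in Python, not part of the original disclosure, that checks both properties on the primer sequences given above (with the extraction spaces removed). The helper name `reverse_complement` is hypothetical, and GAGCTC is taken as the SacI recognition sequence.

```python
# Translation table mapping each base to its Watson-Crick complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string; whitespace is ignored."""
    return "".join(dna.split()).upper().translate(COMPLEMENT)[::-1]


# Mutagenic primers as listed above (SEQ ID NO:12 and SEQ ID NO:13).
SENSE = "GTATGTGTCTGAAAATGAGCGTGGAGATTGGGCTCGCAC"
ANTISENSE = "GTGCGAGCCCAATCTCCACGCTCATTTTCAGACACATAC"

# SacI recognition sequence.
SACI_SITE = "GAGCTC"

print(reverse_complement(SENSE) == ANTISENSE)   # True
print(SACI_SITE in SENSE)                        # False
```

Both checks pass for the sequences as listed, which is consistent with a complementary primer pair carrying the mutated site.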

V1J - Vaccine vector V1J was generated to remove the promoter and
transcription termination elements from vector V1 in order to place them within a
more defined context, create a more compact vector, and to improve plasmid
purification yields. V1J is derived from vectors V1 and pUC18, a commercially
available plasmid. V1 was digested with SspI and EcoRI restriction enzymes
producing two fragments of DNA. The smaller of these fragments, containing the
CMVintA promoter and Bovine Growth Hormone (BGH) transcription termination
elements which control the expression of heterologous genes, was purified from an
agarose electrophoresis gel. The ends of this DNA fragment were then "blunted"
using the T4 DNA polymerase enzyme in order to facilitate its ligation to another
"blunt-ended" DNA fragment. pUC18 was chosen to provide the "backbone" of the
expression vector. It is known to produce high yields of plasmid, is
well-characterized by sequence and function, and is of small size. The entire lac operon
was removed from this vector by partial digestion with the HaeII restriction enzyme.
The remaining plasmid was purified from an agarose electrophoresis gel, blunt-ended
with the T4 DNA polymerase, treated with calf intestinal alkaline phosphatase, and
ligated to the CMVintA/BGH element described above. Plasmids exhibiting either of
two possible orientations of the promoter elements within the pUC backbone were
obtained. One of these plasmids gave much higher yields of DNA in E. coli and was
designated V1J. This vector's structure was verified by sequence analysis of the
junction regions and was subsequently demonstrated to give comparable or higher
expression of heterologous genes compared with V1. The nucleotide sequence of V1J
is as follows:

TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 
CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 
TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 

15 ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG 
CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 
TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT AATCAATTAC 
GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT ACATAACTTA CGGTAAATGG 
CCCGCCTGGC TGACCGCCCA ACGACCCCCG CCCATTGACG TCAATAATGA CGTATGTTCC • 

20 CATAGTAACG CCAATAGGGA CTTTCCATTG AC GTCAATGG GTGGAGTATT TACGGTAAAC 
TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 
TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC 
TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG GTGATGCGGT TTTGGCAGTA 
CATCAATGGG CGTGGATAGC GGTTTGACTC ACGGGGATTT CCAAGTCTCC ACCCCATTGA 

25 CGTCAATGGG AGTTTGTTTT GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA 
CTCCGCCCCA TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 
AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT TTGAC CTCC A 
TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA CGGTGCATTG GAACGCGGAT 
TCCCCGTGCC AAGAGTGACG TAAGTACCGC CTATAGAGTC TATAGGC C C A CCCCCTTGGC 

30 TTCTTATGCA TGCTATACTG TTTTTGGCTT GGGGTCTATA CACCCCCGCT TCCTCATGTT 
ATAGGTGATG GTATAGCTTA GCCTATAGGT GTGGGTTATT GACCATTATT GACCACTCCC 
CTATTGGTGA CGATACTTTC CATTACTAAT CCATAACATG GCTCTTTGCC ACAACTCTCT 
TTATTGGCTA TATGCCAATA CACTGTCCTT CAGAGACTGA CACGGACTCT GTATTTTTAC 
AGGATGGGGT CTCATTTATT ATTTACAAAT TCACATATAC AACACCACCG TCCCCAGTGC 
CCGCAGTTTT TATTAAACAT AACGTGGGAT CTCCACGCGA ATCTCGGGTA CGTGTTCCGG 
ACATGGGCTC TTCTCCGGTA GCGGCGGAGC TTCTACATCC GAGCCCTGCT CCCATGCCTC 
CAGCGACTCA TGGTCGCTCG GCAGCTCCTT GCTCCTAACA GTGGAGGCCA GACTTAGGCA 
CAGCACGATG CCCACCACCA CCAGTGTGCC GCACAAGGCC GTGGCGGTAG GGTATGTGTC 
TGAAAATGAG CTCGGGGAGC GGGCTTGCAC CGCTGACGCA TTTGGAAGAC TTAAGGCAGC 
GGCAGAAGAA GATGCAGGCA GCTGAGTTGT TGTGTTCTGA TAAGAGTCAG AGGTAACTCC 
CGTTGCGGTG CTGTTAACGG TGGAGGGCAG TGTAGTCTGA GCAGTACTCG TTGCTGCCGC 
GCGCGCCACC AGACATAATA GCTGACAGAC TAACAGACTG TTCCTTTCCA TGGGTCTTTT 
CTGCAGTCAC CGTCCTTAGA TCTGCTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC 
CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA 
ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTGGGGTGG 
GGCAGCACAG CAAGGGGGAG GATTGGGAAG ACAATAGCAG GCATGCTGGG GATGCGGTGG 
GCTCTATGGG TACCCAGGTG CTGAAGAATT GACCCGGTTC CTCCTGGGCC AGAAAGAAGC 
AGGCACATCC CCTTCTCTGT GACACACCCT GTCCACGCCC CTGGTTCTTA GTTCCAGCCC 
CACTCATAGG ACACTCATAG CTCAGGAGGG CTCCGCCTTC AATCCCACCC GCTAAAGTAC 
TTGGAGCGGT CTCTCCCTCC CTCATCAGCC CACCAAACCA AACCTAGCCT CCAAGAGTGG 
GAAGAAATTA AAGCAAGATA GGCTATTAAG TGCAGAGGGA GAGAAAATGC CTCCAACATG 
TGAGGAAGTA ATGAGAGAAA TCATAGAATT TCTTCCGCTT CCTCGCTCAC TGACTCGCTG 
CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA 
TCCACAGAAT CAGGGGATAA CGCAGGAAAG AACATGTGAG CAAAAGGCCA GCAAAAGGCC 
AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCCATA GGCTCCGCCC CCCTGACGAG 
CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC CGACAGGACT ATAAAGATAC 
CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG TTCCGACCCT GCCGCTTACC 
GGATACCTGT CCGCCTTTCT CCCTTCGGGA AGCGTGGCGC TTTCTCAATG CTCACGCTGT 
AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG GCTGTGTGCA CGAACCCCCC 
GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC TTGAGTCCAA CCCGGTAAGA 
CACGACTTAT CGCCACTGGC AGCAGCCACT GGTAACAGGA TTAGCAGAGC GAGGTATGTA 
GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG CCTAACTACG GCTACACTAG AAGGACAGTA 
TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA AAAGAGTTGG TAGCTCTTGA 
TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG TTTGCAAGCA GCAGATTACG 
CGCAGAAAAA AAGGATCTCA AGAAGATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG 
TGGAACGAAA ACTCACGTTA AGGGATTTTG GTCATGAGAT TATCAAAAAG GATCTTCACC 
TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA TGAGTAAACT 
TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT 
CGTTCATCCA TAGTTGCCTG ACTCCCCGTC GTGTAGATAA CTACGATACG GGAGGGCTTA 
CCATCTGGCC CCAGTGCTGC AATGATACCG CGAGACCCAC GCTCACCGGC TCCAGATTTA 
TCAGCAATAA ACCAGCCAGC CGGAAGGGCC GAGCGCAGAA GTGGTCCTGC AACTTTATCC 
GCCTCCATCC AGTCTATTAA TTGTTGCCGG GAAGCTAGAG TAAGTAGTTC GCCAGTTAAT 
AGTTTGCGCA ACGTTGTTGC CATTGCTACA GGCATCGTGG TGTCACGCTC GTCGTTTGGT 
ATGGCTTCAT TCAGCTCCGG TTCCCAACGA TCAAGGCGAG TTACATGATC CCCCATGTTG 
TGCAAAAAAG CGGTTAGCTC CTTCGGTCCT CCGATCGTTG TCAGAAGTAA GTTGGCCGCA 
GTGTTATCAC TCATGGTTAT GGCAGCACTG CATAATTCTC TTACTGTCAT GCCATCCGTA 
AGATGCTTTT CTGTGACTGG TGAGTACTCA ACCAAGTCAT TCTGAGAATA GTGTATGCGG 

CGACCGAGTT GCTCTTGCCC GGCGTCAATA CGGGATAATA CCGCGCCACA TAGCAGAACT 
TTAAAAGTGC TCATCATTGG AAAACGTTCT TCGGGGCGAA AACTCTCAAG GATCTTACCG 
CTGTTGAGAT CCAGTTCGAT GTAACCCACT CGTGCACCCA ACTGATCTTC AGCATCTTTT 
ACTTTCACCA GCGTTTCTGG GTGAGCAAAA ACAGGAAGGC AAAATGCCGC AAAAAAGGGA 
ATAAGGGCGA CACGGAAATG TTGAATACTC ATACTCTTCC TTTTTCAATA TTATTGAAGC 

ATTTATCAGG GTTATTGTCT CATGAGCGGA TACATATTTG AATGTATTTA GAAAAATAAA 
CAAATAGGGG TTCCGCGCAC ATTTCCCCGA AAAGTGCCAC CTGACGTCTA AGAAACCATT 
ATTATCATGA CATTAACCTA TAAAAATAGG CGTATCACGA GGCCCTTTCG TC (SEQ ID NO:14). 

V1Jneo - Construction of the vaccine expression vector V1Jneo involved 
removal of the ampr gene and insertion of the kanr gene (neomycin 
phosphotransferase). The ampr gene from the pUC backbone of V1J was removed by 
digestion with the SspI and Eam1105I restriction enzymes. The remaining plasmid was 
purified by agarose gel electrophoresis, blunt-ended with T4 DNA polymerase, and 
then treated with calf intestinal alkaline phosphatase. The commercially available 
kanr gene, derived from transposon 903 and contained within the pUC4K plasmid, 
was excised using the PstI restriction enzyme, purified by agarose gel electrophoresis, 
and blunt-ended with T4 DNA polymerase. This fragment was ligated with the V1J 
backbone, and plasmids with the kanr gene in either orientation were derived and 
designated V1Jneo #1 and #3. Each of these plasmids was confirmed by 
restriction enzyme digestion analysis and DNA sequencing of the junction regions, and 
each was shown to produce quantities of plasmid similar to V1J. Expression of 
heterologous gene products from these V1Jneo vectors was also comparable to that 
from V1J. V1Jneo #3, referred to as V1Jneo hereafter, was selected as the expression 
construct; it contains the kanr gene in the same orientation as the ampr gene in V1J 
and provides resistance to neomycin, kanamycin and G418. The nucleotide sequence of 
V1Jneo is as follows: 



TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 
CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 
TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 
ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG 
CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 
TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT AATCAATTAC 
GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT ACATAACTTA CGGTAAATGG 
CCCGCCTGGC TGACCGCCCA ACGACCCCCG CCCATTGACG TCAATAATGA CGTATGTTCC 
CATAGTAACG CCAATAGGGA CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC 
TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 
TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC 
TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG GTGATGCGGT TTTGGCAGTA 
CATCAATGGG CGTGGATAGC GGTTTGACTC ACGGGGATTT CCAAGTCTCC ACCCCATTGA 
CGTCAATGGG AGTTTGTTTT GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA 
CTCCGCCCCA TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 
AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT TTGACCTCCA 
TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA CGGTGCATTG GAACGCGGAT 
TCCCCGTGCC AAGAGTGACG TAAGTACCGC CTATAGAGTC TATAGGCCCA CCCCCTTGGC 
TTCTTATGCA TGCTATACTG TTTTTGGCTT GGGGTCTATA CACCCCCGCT TCCTCATGTT 
ATAGGTGATG GTATAGCTTA GCCTATAGGT GTGGGTTATT GACCATTATT GACCACTCCC 
CTATTGGTGA CGATACTTTC CATTACTAAT CCATAACATG GCTCTTTGCC ACAACTCTCT 
TTATTGGCTA TATGCCAATA CACTGTCCTT CAGAGACTGA CACGGACTCT GTATTTTTAC 
AGGATGGGGT CTCATTTATT ATTTACAAAT TCACATATAC AACACCACCG TCCCCAGTGC 
CCGCAGTTTT TATTAAACAT AACGTGGGAT CTCCACGCGA ATCTCGGGTA CGTGTTCCGG 
ACATGGGCTC TTCTCCGGTA GCGGCGGAGC TTCTACATCC GAGCCCTGCT CCCATGCCTC 
CAGCGACTCA TGGTCGCTCG GCAGCTCCTT GCTCCTAACA GTGGAGGCCA GACTTAGGCA 
CAGCACGATG CCCACCACCA CCAGTGTGCC GCACAAGGCC GTGGCGGTAG GGTATGTGTC 
TGAAAATGAG CTCGGGGAGC GGGCTTGCAC CGCTGACGCA TTTGGAAGAC TTAAGGCAGC 
GGCAGAAGAA GATGCAGGCA GCTGAGTTGT TGTGTTCTGA TAAGAGTCAG AGGTAACTCC 
CGTTGCGGTG CTGTTAACGG TGGAGGGCAG TGTAGTCTGA GCAGTACTCG TTGCTGCCGC 
GCGCGCCACC AGACATAATA GCTGACAGAC TAACAGACTG TTCCTTTCCA TGGGTCTTTT 
CTGCAGTCAC CGTCCTTAGA TCTGCTGTGC CTTCTAGTTG CCAGCCATCT GTTGTTTGCC 
CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG GTGCCACTCC CACTGTCCTT TCCTAATAAA 
ATGAGGAAAT TGCATCGCAT TGTCTGAGTA GGTGTCATTC TATTCTGGGG GGTGGGGTGG 
GGCAGCACAG CAAGGGGGAG GATTGGGAAG ACAATAGCAG GCATGCTGGG GATGCGGTGG 
GCTCTATGGG TACCCAGGTG CTGAAGAATT GACCCGGTTC CTCCTGGGCC AGAAAGAAGC 
AGGCACATCC CCTTCTCTGT GACACACCCT GTCCACGCCC CTGGTTCTTA GTTCCAGCCC 
CACTCATAGG ACACTCATAG CTCAGGAGGG CTCCGCCTTC AATCCCACCC GCTAAAGTAC 
TTGGAGCGGT CTCTCCCTCC CTCATCAGCC CACCAAACCA AACCTAGCCT CCAAGAGTGG 
GAAGAAATTA AAGCAAGATA GGCTATTAAG TGCAGAGGGA GAGAAAATGC CTCCAACATG 
TGAGGAAGTA ATGAGAGAAA TCATAGAATT TCTTCCGCTT CCTCGCTCAC TGACTCGCTG 
CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA TCAGCTCACT CAAAGGCGGT AATACGGTTA 
TCCACAGAAT CAGGGGATAA CGCAGGAAAG AACATGTGAG CAAAAGGCCA GCAAAAGGCC 
AGGAACCGTA AAAAGGCCGC GTTGCTGGCG TTTTTCCATA GGCTCCGCCC CCCTGACGAG 
CATCACAAAA ATCGACGCTC AAGTCAGAGG TGGCGAAACC CGACAGGACT ATAAAGATAC 
CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG CGCTCTCCTG TTCCGACCCT GCCGCTTACC 
GGATACCTGT CCGCCTTTCT CCCTTCGGGA AGCGTGGCGC TTTCTCAATG CTCACGCTGT 
AGGTATCTCA GTTCGGTGTA GGTCGTTCGC TCCAAGCTGG GCTGTGTGCA CGAACCCCCC 
GTTCAGCCCG ACCGCTGCGC CTTATCCGGT AACTATCGTC TTGAGTCCAA CCCGGTAAGA 
CACGACTTAT CGCCACTGGC AGCAGCCACT GGTAACAGGA TTAGCAGAGC GAGGTATGTA 
GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG CCTAACTACG GCTACACTAG AAGGACAGTA 
TTTGGTATCT GCGCTCTGCT GAAGCCAGTT ACCTTCGGAA AAAGAGTTGG TAGCTCTTGA 
TCCGGCAAAC AAACCACCGC TGGTAGCGGT GGTTTTTTTG TTTGCAAGCA GCAGATTACG 
CGCAGAAAAA AAGGATCTCA AGAAGATCCT TTGATCTTTT CTACGGGGTC TGACGCTCAG 
TGGAACGAAA ACTCACGTTA AGGGATTTTG GTCATGAGAT TATCAAAAAG GATCTTCACC 
TAGATCCTTT TAAATTAAAA ATGAAGTTTT AAATCAATCT AAAGTATATA TGAGTAAACT 
TGGTCTGACA GTTACCAATG CTTAATCAGT GAGGCACCTA TCTCAGCGAT CTGTCTATTT 
CGTTCATCCA TAGTTGCCTG ACTCCGGGGG GGGGGGGCGC TGAGGTCTGC CTCGTGAAGA 
AGGTGTTGCT GACTCATACC AGGCCTGAAT CGCCCCATCA TCCAGCCAGA AAGTGAGGGA 
GCCACGGTTG ATGAGAGCTT TGTTGTAGGT GGACCAGTTG GTGATTTTGA ACTTTTGCTT 
TGCCACGGAA CGGTCTGCGT TGTCGGGAAG ATGCGTGATC TGATCCTTCA ACTCAGCAAA 
AGTTCGATTT ATTCAACAAA GCCGCCGTCC CGTCAAGTCA GCGTAATGCT CTGCCAGTGT 
TACAACCAAT TAACCAATTC TGATTAGAAA AACTCATCGA GCATCAAATG AAACTGCAAT 
TTATTCATAT CAGGATTATC AATACCATAT TTTTGAAAAA GCCGTTTCTG TAATGAAGGA 
GAAAACTCAC CGAGGCAGTT CCATAGGATG GCAAGATCCT GGTATCGGTC TGCGATTCCG 
ACTCGTCCAA CATCAATACA ACCTATTAAT TTCCCCTCGT CAAAAATAAG GTTATCAAGT 
GAGAAATCAC CATGAGTGAC GACTGAATCC GGTGAGAATG GCAAAAGCTT ATGCATTTCT 
TTCCAGACTT GTTCAACAGG CCAGCCATTA CGCTCGTCAT CAAAATCACT CGCATCAACC 
AAACCGTTAT TCATTCGTGA TTGCGCCTGA GCGAGACGAA ATACGCGATC GCTGTTAAAA 
GGACAATTAC AAACAGGAAT CGAATGCAAC CGGCGCAGGA ACACTGCCAG CGCATCAACA 
ATATTTTCAC CTGAATCAGG ATATTCTTCT AATACCTGGA ATGCTGTTTT CCCGGGGATC 
GCAGTGGTGA GTAACCATGC ATCATCAGGA GTACGGATAA AATGCTTGAT GGTCGGAAGA 
GGCATAAATT CCGTCAGCCA GTTTAGTCTG ACCATCTCAT CTGTAACATC ATTGGCAACG 




CTACCTTTGC CATGTTTCAG AAACAACTCT GGCGCATCGG GCTTCCCATA CAATCGATAG 
ATTGTCGCAC CTGATTGCCC GACATTATCG CGAGCCCATT TATACCCATA TAAATCAGCA 
TCCATGTTGG AATTTAATCG CGGCCTCGAG CAAGACGTTT CCCGTTGAAT ATGGCTCATA 
ACACCCCTTG TATTACTGTT TATGTAAGCA GACAGTTTTA TTGTTCATGA TGATATATTT 
TTATCTTGTG CAATGTAACA TCAGAGATTT TGAGACACAA CGTGGCTTTC CCCCCCCCCC 
CATTATTGAA GCATTTATCA GGGTTATTGT CTCATGAGCG GATACATATT TGAATGTATT 
TAGAAAAATA AACAAATAGG GGTTCCGCGC ACATTTCCCC GAAAAGTGCC ACCTGACGTC 
TAAGAAACCA TTATTATCAT GACATTAACC TATAAAAATA GGCGTATCAC GAGGCCCTTT 
CGTC (SEQ ID NO:15). 

V1Jns - The expression vector V1Jns was generated by adding an SfiI site to 
V1Jneo to facilitate integration studies. A commercially available 13 base pair SfiI 
linker (New England BioLabs) was added at the KpnI site within the BGH sequence 
of the vector. V1Jneo was linearized with KpnI, gel purified, blunted with T4 DNA 
polymerase, and ligated to the blunt SfiI linker. Clonal isolates were chosen by 
restriction mapping and verified by sequencing through the linker. The new vector 
was designated V1Jns. Expression of heterologous genes in V1Jns (with SfiI) was 
comparable to expression of the same genes in V1Jneo (with KpnI). 
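
As a rough illustration of how the restriction-mapping check on the KpnI-to-SfiI conversion could be mirrored in software, the following Python sketch scans a short stretch of sequence for the KpnI site (GGTACC) and the SfiI site (GGCCNNNNNGGCC). The two fragments are copied from the BGH region of the V1Jneo and V1Jns listings in this section. The function and variable names are illustrative assumptions and are not part of the described method.

```python
import re

# Recognition sites written as regular expressions (N = any base).
SITES = {
    "KpnI": "GGTACC",
    "SfiI": "GGCC[ACGT]{5}GGCC",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for each site found in seq."""
    seq = seq.upper().replace(" ", "")
    hits = {}
    for name, pattern in SITES.items():
        # A lookahead is used so that overlapping matches are reported too.
        hits[name] = [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]
    return hits


# Fragments around the KpnI/SfiI position, copied from the sequence listings.
v1jneo_fragment = "GCTCTATGGG TACCCAGGTG CTGAAGAATT"
v1jns_fragment = "CTCTATGGCC GCTGCGGCCA GGTGCTGAAG"

neo_hits = find_sites(v1jneo_fragment)
ns_hits = find_sites(v1jns_fragment)
print("V1Jneo fragment:", neo_hits)
print("V1Jns fragment: ", ns_hits)
```

In this sketch the V1Jneo fragment should report a KpnI site and no SfiI site, and the V1Jns fragment the reverse, which is the pattern a restriction map of the two vectors would be expected to show at this position.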

The nucleotide sequence of V1Jns is as follows: 

TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 
CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 
TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 
ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG 
CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 
TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT AATCAATTAC 
GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT ACATAACTTA CGGTAAATGG 
CCCGCCTGGC TGACCGCCCA ACGACCCCCG CCCATTGACG TCAATAATGA CGTATGTTCC 
CATAGTAACG CCAATAGGGA CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC 




WO 01/45748 



PCT/US00/34724 



TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 
TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC 
TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG GTGATGCGGT TTTGGCAGTA 
CATCAATGGG CGTGGATAGC GGTTTGACTC ACGGGGATTT CCAAGTCTCC ACCCCATTGA 
CGTCAATGGG AGTTTGTTTT GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA 
CTCCGCCCCA TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 
AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT TTGACCTCCA 
TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA CGGTGCATTG GAACGCGGAT 
TCCCCGTGCC AAGAGTGACG TAAGTACCGC CTATAGACTC TATAGGCACA CCCCTTTGGC 
TCTTATGCAT GCTATACTGT TTTTGGCTTG GGGCCTATAC ACCCCCGCTT CCTTATGCTA 
TAGGTGATGG TATAGCTTAG CCTATAGGTG TGGGTTATTG ACCATTATTG ACCACTCCCC 
TATTGGTGAC GATACTTTCC ATTACTAATC CATAACATGG CTCTTTGCCA CAACTATCTC 
TATTGGCTAT ATGCCAATAC TCTGTCCTTC AGAGACTGAC ACGGACTCTG TATTTTTACA 
GGATGGGGTC CCATTTATTA TTTACAAATT CACATATACA ACAACGCCGT CCCCCGTGCC 
CGCAGTTTTT ATTAAACATA GCGTGGGATC TCCACGCGAA TCTCGGGTAC GTGTTCCGGA 
CATGGGCTCT TCTCCGGTAG CGGCGGAGCT TCCACATCCG AGCCCTGGTC CCATGCCTCC 
AGCGGCTCAT GGTCGCTCGG CAGCTCCTTG CTCCTAACAG TGGAGGCCAG ACTTAGGCAC 
AGCACAATGC CCACCACCAC CAGTGTGCCG CACAAGGCCG TGGCGGTAGG GTATGTGTCT 
GAAAATGAGC GTGGAGATTG GGCTCGCACG GCTGACGCAG ATGGAAGACT TAAGGCAGCG 
GCAGAAGAAG ATGCAGGCAG CTGAGTTGTT GTATTCTGAT AAGAGTCAGA GGTAACTCCC 
GTTGCGGTGC TGTTAACGGT GGAGGGCAGT GTAGTCTGAG CAGTACTCGT TGCTGCCGCG 
CGCGCCACCA GACATAATAG CTGACAGACT AACAGACTGT TCCTTTCCAT GGGTCTTTTC 
TGCAGTCACC GTCCTTAGAT CTGCTGTGCC TTCTAGTTGC CAGCCATCTG TTGTTTGCCC 
CTCCCCCGTG CCTTCCTTGA CCCTGGAAGG TGCCACTCCC ACTGTCCTTT CCTAATAAAA 
TGAGGAAATT GCATCGCATT GTCTGAGTAG GTGTCATTCT ATTCTGGGGG GTGGGGTGGG 
GCAGGACAGC AAGGGGGAGG ATTGGGAAGA CAATAGCAGG CATGCTGGGG ATGCGGTGGG 
CTCTATGGCC GCTGCGGCCA GGTGCTGAAG AATTGACCCG GTTCCTCCTG GGCCAGAAAG 
AAGCAGGCAC ATCCCCTTCT CTGTGACACA CCCTGTCCAC GCCCCTGGTT CTTAGTTCCA 
GCCCCACTCA TAGGACACTC ATAGCTCAGG AGGGCTCCGC CTTCAATCCC ACCCGCTAAA 
GTACTTGGAG CGGTCTCTCC CTCCCTCATC AGCCCACCAA ACCAAACCTA GCCTCCAAGA 
GTGGGAAGAA ATTAAAGCAA GATAGGCTAT TAAGTGCAGA GGGAGAGAAA ATGCCTCCAA 
CATGTGAGGA AGTAATGAGA GAAATCATAG AATTTCTTCC GCTTCCTCGC TCACTGACTC 
GCTGCGCTCG GTCGTTCGGC TGCGGCGAGC GGTATCAGCT CACTCAAAGG CGGTAATACG 
GTTATCCACA GAATCAGGGG ATAACGCAGG AAAGAACATG TGAGCAAAAG GCCAGCAAAA 






GGCCAGGAAC CGTAAAAAGG CCGCGTTGCT GGCGTTTTTC CATAGGCTCC GCCCCCCTGA 
CGAGCATCAC AAAAATCGAC GCTCAAGTCA GAGGTGGCGA AACCCGACAG GACTATAAAG 
ATACCAGGCG TTTCCCCCTG GAAGCTCCCT CGTGCGCTCT CCTGTTCCGA CCCTGCCGCT 
TACCGGATAC CTGTCCGCCT TTCTCCCTTC GGGAAGCGTG GCGCTTTCTC ATAGCTCACG 
CTGTAGGTAT CTCAGTTCGG TGTAGGTCGT TCGCTCCAAG CTGGGCTGTG TGCACGAACC 
CCCCGTTCAG CCCGACCGCT GCGCCTTATC CGGTAACTAT CGTCTTGAGT CCAACCCGGT 
AAGACACGAC TTATCGCCAC TGGCAGCAGC CACTGGTAAC AGGATTAGCA GAGCGAGGTA 
TGTAGGCGGT GCTACAGAGT TCTTGAAGTG GTGGCCTAAC TACGGCTACA CTAGAAGAAC 
AGTATTTGGT ATCTGCGCTC TGCTGAAGCC AGTTACCTTC GGAAAAAGAG TTGGTAGCTC 
TTGATCCGGC AAACAAACCA CCGCTGGTAG CGGTGGTTTT TTTGTTTGCA AGCAGCAGAT 
TACGCGCAGA AAAAAAGGAT CTCAAGAAGA TCCTTTGATC TTTTCTACGG GGTCTGACGC 
TCAGTGGAAC GAAAACTCAC GTTAAGGGAT TTTGGTCATG AGATTATCAA AAAGGATCTT 
CACCTAGATC CTTTTAAATT AAAAATGAAG TTTTAAATCA ATCTAAAGTA TATATGAGTA 
AACTTGGTCT GACAGTTACC AATGCTTAAT CAGTGAGGCA CCTATCTCAG CGATCTGTCT 
ATTTCGTTCA TCCATAGTTG CCTGACTCGG GGGGGGGGGG CGCTGAGGTC TGCCTCGTGA 
AGAAGGTGTT GCTGACTCAT ACCAGGCCTG AATCGCCCCA TCATCCAGCC AGAAAGTGAG 
GGAGCCACGG TTGATGAGAG CTTTGTTGTA GGTGGACCAG TTGGTGATTT TGAACTTTTG 
CTTTGCCACG GAACGGTCTG CGTTGTCGGG AAGATGCGTG ATCTGATCCT TCAACTCAGC 
AAAAGTTCGA TTTATTCAAC AAAGCCGCCG TCCCGTCAAG TCAGCGTAAT GCTCTGCCAG 
TGTTACAACC AATTAACCAA TTCTGATTAG AAAAACTCAT CGAGCATCAA ATGAAACTGC 
AATTTATTCA TATCAGGATT ATCAATACCA TATTTTTGAA AAAGCCGTTT CTGTAATGAA 
GGAGAAAACT CACCGAGGCA GTTCCATAGG ATGGCAAGAT CCTGGTATCG GTCTGCGATT 
CCGACTCGTC CAACATCAAT ACAACCTATT AATTTCCCCT CGTCAAAAAT AAGGTTATCA 
AGTGAGAAAT CACCATGAGT GACGACTGAA TCCGGTGAGA ATGGCAAAAG CTTATGCATT 
TCTTTCCAGA CTTGTTCAAC AGGCCAGCCA TTACGCTCGT CATCAAAATC ACTCGCATCA 
ACCAAACCGT TATTCATTCG TGATTGCGCC TGAGCGAGAC GAAATACGCG ATCGCTGTTA 
AAAGGACAAT TACAAACAGG AATCGAATGC AACCGGCGCA GGAACACTGC CAGCGCATCA 
ACAATATTTT CACCTGAATC AGGATATTCT TCTAATACCT GGAATGCTGT TTTCCCGGGG 
ATCGCAGTGG TGAGTAACCA TGCATCATCA GGAGTACGGA TAAAATGCTT GATGGTCGGA 
AGAGGCATAA ATTCCGTCAG CCAGTTTAGT CTGACCATCT CATCTGTAAC ATCATTGGCA 
ACGCTACCTT TGCCATGTTT CAGAAACAAC TCTGGCGCAT CGGGCTTCCC ATACAATCGA 
TAGATTGTCG CACCTGATTG CCCGACATTA TCGCGAGCCC ATTTATACCC ATATAAATCA 
GCATCCATGT TGGAATTTAA TCGCGGCCTC GAGCAAGACG TTTCCCGTTG AATATGGCTC 
ATAACACCCC TTGTATTACT GTTTATGTAA GCAGACAGTT TTATTGTTCA TGATGATATA 
TTTTTATCTT GTGCAATGTA ACATCAGAGA TTTTGAGACA CAACGTGGCT TTCCCCCCCC 
CCCCATTATT GAAGCATTTA TCAGGGTTAT TGTCTCATGA GCGGATACAT ATTTGAATGT 
ATTTAGAAAA ATAAACAAAT AGGGGTTCCG CGCACATTTC CCCGAAAAGT GCCACCTGAC 
GTCTAAGAAA CCATTATTAT CATGACATTA ACCTATAAAA ATAGGCGTAT CACGAGGCCC 
TTTCGTC (SEQ ID NO:16). 

The underlined nucleotides of SEQ ID NO:16 represent the SfiI site 
introduced into the KpnI site of V1Jneo. 

V1Jns-tPA - The vaccine vector V1Jns-tPA was constructed in order to fuse 
a heterologous leader peptide sequence to the pol DNA constructs of the present 
invention. More specifically, the vaccine vector V1Jns was modified to include the 
human tissue-specific plasminogen activator (tPA) leader. As an exemplification, but 
by no means a limitation, of generating a pol DNA construct comprising an 
amino-terminal leader sequence, plasmid V1Jneo was modified to include the human 
tissue-specific plasminogen activator (tPA) leader. Two synthetic complementary 
oligomers were annealed and then ligated into V1Jneo which had been digested with 
BglII. The sense and antisense oligomers were 
5'-GATCACCATGGATGCAATGAAGAGAGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAGCGA-3' 
(SEQ ID NO:17); and, 
5'-GATCTCGGTGGGCGAAACGAAGACTGCTCCACACAGCAGCAGCACACAGCAGAGCCCTCTCTTCATTGCATCCATGGT-3' 
(SEQ ID NO:18). The Kozak sequence is underlined in the sense oligomer. These 
oligomers have overhanging bases compatible for ligation to BglII-cleaved sequences. 
After ligation the upstream BglII site is destroyed while the downstream BglII site is 
retained for subsequent ligations. Both the junction sites and the entire tPA 
leader sequence were verified by DNA sequencing. Additionally, in order to conform 
with V1Jns (=V1Jneo with an SfiI site), an SfiI restriction site was placed at the KpnI 
site within the BGH terminator region of V1Jneo-tPA by blunting the KpnI site with 
T4 DNA polymerase followed by ligation with an SfiI linker (catalogue #1138, New 
England Biolabs), resulting in V1Jns-tPA. This modification was verified by 
restriction digestion and agarose gel electrophoresis. 
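
The fate of the two BglII sites after the leader is ligated in can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the top strand is treated as a plain string, the vector context is a 30-base stretch copied from the V1Jneo listing around its BglII site, and the helper names (`insert_at_bglii`, `bglii_sites`) are hypothetical.

```python
BGLII = "AGATCT"  # BglII cuts A^GATCT, leaving a 5'-GATC overhang

# Sense oligomer (SEQ ID NO:17) as given in the text; it starts with GATC.
SENSE = ("GATCACCATGGATGCAATGAAGAG"
         "AGGGCTCTGCTGTGTGCTGCTGCTGTGTGGAGCAGTCTTCGTTTCGCCCAG"
         "CGA")


def bglii_sites(seq):
    """0-based start positions of every BglII site on the top strand."""
    return [i for i in range(len(seq) - 5) if seq[i:i + 6] == BGLII]


def insert_at_bglii(vector, insert_top):
    """Top strand after cutting at the first BglII site and ligating the insert."""
    cut = vector.index(BGLII) + 1          # the cut falls just after the A
    left, right = vector[:cut], vector[cut:]
    return left + insert_top + right       # 'right' still begins with GATCT


# Vector context around the BglII site, copied from the V1Jneo listing.
vector_context = "CTGCAGTCACCGTCCTTAGATCTGCTGTGC"
product = insert_at_bglii(vector_context, SENSE)

cut = vector_context.index(BGLII) + 1
upstream_junction = product[cut - 1:cut + 5]
downstream_junction = product[cut + len(SENSE) - 1:cut + len(SENSE) + 5]
print("upstream junction:  ", upstream_junction)
print("downstream junction:", downstream_junction)
print("BglII sites in product:", bglii_sites(product))
```

Because the sense oligomer continues with an A after its GATC overhang while the vector continues with a T, only the downstream junction regenerates AGATCT. The resulting string can be compared with the corresponding stretch of the V1Jns-tPA listing.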
The V1Jns-tPA vector nucleotide sequence is as follows: 

TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 
CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 
TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 
ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG 
CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 
TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT AATCAATTAC 
GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT ACATAACTTA CGGTAAATGG 
CCCGCCTGGC TGACCGCCCA ACGACCCCCG CCCATTGACG TCAATAATGA CGTATGTTCC 

CATAGTAACG CCAATAGGGA CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC 
TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 
TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC 
TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG GTGATGCGGT TTTGGCAGTA 
CATCAATGGG CGTGGATAGC GGTTTGACTC ACGGGGATTT CCAAGTCTCC ACCCCATTGA 

CGTCAATGGG AGTTTGTTTT GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA 
CTCCGCCCCA TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 
AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT TTGACCTCCA 
TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA CGGTGCATTG GAACGCGGAT 
TCCCCGTGCC AAGAGTGACG TAAGTACCGC CTATAGACTC TATAGGCACA CCCCTTTGGC 

TCTTATGCAT GCTATACTGT TTTTGGCTTG GGGCCTATAC ACCCCCGCTT CCTTATGCTA 
TAGGTGATGG TATAGCTTAG CCTATAGGTG TGGGTTATTG ACCATTATTG ACCACTCCCC 
TATTGGTGAC GATACTTTCC ATTACTAATC CATAACATGG CTCTTTGCCA CAACTATCTC 
TATTGGCTAT ATGCCAATAC TCTGTCCTTC AGAGACTGAC ACGGACTCTG TATTTTTACA 
GGATGGGGTC CCATTTATTA TTTACAAATT CACATATACA ACAACGCCGT CCCCCGTGCC 

CGCAGTTTTT ATTAAACATA GCGTGGGATC TCCACGCGAA TCTCGGGTAC GTGTTCCGGA 
CATGGGCTCT TCTCCGGTAG CGGCGGAGCT TCCACATCCG AGCCCTGGTC CCATGCCTCC 
AGCGGCTCAT GGTCGCTCGG CAGCTCCTTG CTCCTAACAG TGGAGGCCAG ACTTAGGCAC 
AGCACAATGC CCACCACCAC CAGTGTGCCG CACAAGGCCG TGGCGGTAGG GTATGTGTCT 
GAAAATGAGC GTGGAGATTG GGCTCGCACG GCTGACGCAG ATGGAAGACT TAAGGCAGCG 

GCAGAAGAAG ATGCAGGCAG CTGAGTTGTT GTATTCTGAT AAGAGTCAGA GGTAACTCCC 
GTTGCGGTGC TGTTAACGGT GGAGGGCAGT GTAGTCTGAG CAGTACTCGT TGCTGCCGCG 
CGCGCCACCA GACATAATAG CTGACAGACT AACAGACTGT TCCTTTCCAT GGGTCTTTTC 
TGCAGTCACC GTCCTTAGAT CACCATGGAT GCAATGAAGA GAGGGCTCTG CTGTGTGCTG 
CTGCTGTGTG GAGCAGTCTT CGTTTCGCCC AGCGAGATCT GCTGTGCCTT CTAGTTGCCA 

GCCATCTGTT GTTTGCCCCT CCCCCGTGCC TTCCTTGACC CTGGAAGGTG CCACTCCCAC 
TGTCCTTTCC TAATAAAATG AGGAAATTGC ATCGCATTGT CTGAGTAGGT GTCATTCTAT 
TCTGGGGGGT GGGGTGGGGC AGGACAGCAA GGGGGAGGAT TGGGAAGACA ATAGCAGGCA 
TGCTGGGGAT GCGGTGGGCT CTATGGCCGC TGCGGCCAGG TGCTGAAGAA TTGACCCGGT 
TCCTCCTGGG CCAGAAAGAA GCAGGCACAT CCCCTTCTCT GTGACACACC CTGTCCACGC 
CCCTGGTTCT TAGTTCCAGC CCCACTCATA GGACACTCAT AGCTCAGGAG GGCTCCGCCT 
TCAATCCCAC CCGCTAAAGT ACTTGGAGCG GTCTCTCCCT CCCTCATCAG CCCACCAAAC 
CAAACCTAGC CTCCAAGAGT GGGAAGAAAT TAAAGCAAGA TAGGCTATTA AGTGCAGAGG 
GAGAGAAAAT GCCTCCAACA TGTGAGGAAG TAATGAGAGA AATCATAGAA TTTCTTCCGC 
TTCCTCGCTC ACTGACTCGC TGCGCTCGGT CGTTCGGCTG CGGCGAGCGG TATCAGCTCA 
CTCAAAGGCG GTAATACGGT TATCCACAGA ATCAGGGGAT AACGCAGGAA AGAACATGTG 
AGCAAAAGGC CAGCAAAAGG CCAGGAACCG TAAAAAGGCC GCGTTGCTGG CGTTTTTCCA 
TAGGCTCCGC CCCCCTGACG AGCATCACAA AAATCGACGC TCAAGTCAGA GGTGGCGAAA 
CCCGACAGGA CTATAAAGAT ACCAGGCGTT TCCCCCTGGA AGCTCCCTCG TGCGCTCTCC 
TGTTCCGACC CTGCCGCTTA CCGGATACCT GTCCGCCTTT CTCCCTTCGG GAAGCGTGGC 
GCTTTCTCAT AGCTCACGCT GTAGGTATCT CAGTTCGGTG TAGGTCGTTC GCTCCAAGCT 
GGGCTGTGTG CACGAACCCC CCGTTCAGCC CGACCGCTGC GCCTTATCCG GTAACTATCG 
TCTTGAGTCC AACCCGGTAA GACACGACTT ATCGCCACTG GCAGCAGCCA CTGGTAACAG 
GATTAGCAGA GCGAGGTATG TAGGCGGTGC TACAGAGTTC TTGAAGTGGT GGCCTAACTA 
CGGCTACACT AGAAGAACAG TATTTGGTAT CTGCGCTCTG CTGAAGCCAG TTACCTTCGG 
AAAAAGAGTT GGTAGCTCTT GATCCGGCAA ACAAACCACC GCTGGTAGCG GTGGTTTTTT 
TGTTTGCAAG CAGCAGATTA CGCGCAGAAA AAAAGGATCT CAAGAAGATC CTTTGATCTT 
TTCTACGGGG TCTGACGCTC AGTGGAACGA AAACTCACGT TAAGGGATTT TGGTCATGAG 
ATTATCAAAA AGGATCTTCA CCTAGATCCT TTTAAATTAA AAATGAAGTT TTAAATCAAT 
CTAAAGTATA TATGAGTAAA CTTGGTCTGA CAGTTACCAA TGCTTAATCA GTGAGGCACC 
TATCTCAGCG ATCTGTCTAT TTCGTTCATC CATAGTTGCC TGACTCGGGG GGGGGGGGCG 
CTGAGGTCTG CCTCGTGAAG AAGGTGTTGC TGACTCATAC CAGGCCTGAA TCGCCCCATC 
ATCCAGCCAG AAAGTGAGGG AGCCACGGTT GATGAGAGCT TTGTTGTAGG TGGACCAGTT 
GGTGATTTTG AACTTTTGCT TTGCCACGGA ACGGTCTGCG TTGTCGGGAA GATGCGTGAT 
CTGATCCTTC AACTCAGCAA AAGTTCGATT TATTCAACAA AGCCGCCGTC CCGTCAAGTC 
AGCGTAATGC TCTGCCAGTG TTACAACCAA TTAACCAATT CTGATTAGAA AAACTCATCG 
AGCATCAAAT GAAACTGCAA TTTATTCATA TCAGGATTAT CAATACCATA TTTTTGAAAA 
AGCCGTTTCT GTAATGAAGG AGAAAACTCA CCGAGGCAGT TCCATAGGAT GGCAAGATCC 
TGGTATCGGT CTGCGATTCC GACTCGTCCA ACATCAATAC AACCTATTAA TTTCCCCTCG 
TCAAAAATAA GGTTATCAAG TGAGAAATCA CCATGAGTGA CGACTGAATC CGGTGAGAAT 
GGCAAAAGCT TATGCATTTC TTTCCAGACT TGTTCAACAG GCCAGCCATT ACGCTCGTCA 
TCAAAATCAC TCGCATCAAC CAAACCGTTA TTCATTCGTG ATTGCGCCTG AGCGAGACGA 
AATACGCGAT CGCTGTTAAA AGGACAATTA CAAACAGGAA TCGAATGCAA CCGGCGCAGG 
AACACTGCCA GCGCATCAAC AATATTTTCA CCTGAATCAG GATATTCTTC TAATACCTGG 
AATGCTGTTT TCCCGGGGAT CGCAGTGGTG AGTAACCATG CATCATCAGG AGTACGGATA 
AAATGCTTGA TGGTCGGAAG AGGCATAAAT TCCGTCAGCC AGTTTAGTCT GACCATCTCA 
TCTGTAACAT CATTGGCAAC GCTACCTTTG CCATGTTTCA GAAACAACTC TGGCGCATCG 
GGCTTCCCAT ACAATCGATA GATTGTCGCA CCTGATTGCC CGACATTATC GCGAGCCCAT 
TTATACCCAT ATAAATCAGC ATCCATGTTG GAATTTAATC GCGGCCTCGA GCAAGACGTT 
TCCCGTTGAA TATGGCTCAT AACACCCCTT GTATTACTGT TTATGTAAGC AGACAGTTTT 
ATTGTTCATG ATGATATATT TTTATCTTGT GCAATGTAAC ATCAGAGATT TTGAGACACA 
ACGTGGCTTT CCCCCCCCCC CCATTATTGA AGCATTTATC AGGGTTATTG TCTCATGAGC 
GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG CACATTTCCC 
CGAAAAGTGC CACCTGACGT CTAAGAAACC ATTATTATCA TGACATTAAC CTATAAAAAT 
AGGCGTATCA CGAGGCCCTT TCGTC (SEQ ID NO:9). 







V1R - Vaccine vector V1R was constructed to obtain a minimum-sized 
vaccine vector without unneeded DNA sequences, which still retained the overall 
optimized heterologous gene expression characteristics and high plasmid yields that 
V1J and V1Jns afford. It was determined that (1) regions within the pUC backbone 
comprising the E. coli origin of replication could be removed without affecting 
plasmid yield from bacteria; (2) the 3'-region of the kanr gene following the 
kanamycin open reading frame could be removed if a bacterial terminator was 
inserted in its place; and, (3) ~300 bp from the 3'-half of the BGH terminator could 
be removed without affecting its regulatory function (following the original KpnI 
restriction enzyme site within the BGH element). V1R was constructed by using PCR 
to synthesize three segments of DNA from V1Jns representing the CMVintA 
promoter/BGH terminator, origin of replication, and kanamycin resistance elements, 
respectively. Restriction enzyme sites unique for each segment were added to each 
segment end using the PCR oligomers: SspI and XhoI for CMVintA/BGH; EcoRV 
and BamHI for the kanr gene; and, BclI and SalI for the ori r. These enzyme sites 
were chosen because they allow directional ligation of each of the PCR-derived DNA 
segments with subsequent loss of each site: EcoRV and SspI leave blunt-ended 
DNAs which are compatible for ligation, while BamHI and BclI leave complementary 
overhangs, as do SalI and XhoI. After obtaining these segments by PCR, each segment 
was digested with the appropriate restriction enzymes indicated above and then 
ligated together in a single reaction mixture containing all three DNA segments. The 
5'-end of the ori r was designed to include the T2 rho independent terminator 
sequence that is normally found in this region so that it could provide termination 
information for the kanamycin resistance gene. The ligated product was confirmed by 
restriction enzyme digestion (>8 enzymes) as well as by DNA sequencing of the 
ligation junctions. DNA plasmid yields and heterologous expression using viral genes 
within V1R appear similar to V1Jns. The net reduction in vector size achieved was 
1346 bp (V1Jns = 4.86 kb; V1R = 3.52 kb). PCR oligomer sequences used to 
synthesize V1R (restriction enzyme sites are underlined and identified in brackets 
following sequence) are as follows: 
(1) 5'-GGTACAAATATTGGCTATTGGCCATTGCATACG-3' (SEQ ID NO:19) [SspI]; 
(2) 5'-CCACATCTCGAGGAACCGGGTCAATTCTTCAGCACC-3' (SEQ ID NO:20) [XhoI] 
(for CMVintA/BGH segment); 
(3) 5'-GGTACAGATATCGGAAAGCCACGTTGTGTCTCAAAATC-3' (SEQ ID NO:21) [EcoRV]; 
(4) 5'-CACATGGATCCGTAATGCTCTGCCAGTGTTACAACC-3' (SEQ ID NO:22) [BamHI] 
(for kanamycin resistance gene segment); 
(5) 5'-GGTACATGATCACGTAGAAAAGATCAAAGGATCTTCTTG-3' (SEQ ID NO:23) [BclI]; 
(6) 5'-CCACATGTCGACCCGTAAAAAGGCCGCGTTGCTGG-3' (SEQ ID NO:24) [SalI] 
(for E. coli origin of replication). 
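
The statement that ligation proceeds with "subsequent loss of each site" can be checked with a short Python sketch. It pairs the enzymes as described above (SspI with EcoRV, BamHI with BclI, SalI with XhoI), builds the hybrid hexamer formed at each junction, and tests whether any of the six enzymes would still recognize it. The recognition sites and cut positions are standard values for these enzymes. The table layout and function names are illustrative assumptions, and the check covers only these six enzymes.

```python
# name: (recognition site, cut position on the top strand)
ENZYMES = {
    "SspI":  ("AATATT", 3),   # blunt
    "EcoRV": ("GATATC", 3),   # blunt
    "BamHI": ("GGATCC", 1),   # 5'-GATC overhang
    "BclI":  ("TGATCA", 1),   # 5'-GATC overhang
    "SalI":  ("GTCGAC", 1),   # 5'-TCGA overhang
    "XhoI":  ("CTCGAG", 1),   # 5'-TCGA overhang
}


def overhang(enzyme):
    """Single-stranded overhang left by the enzyme ('' for a blunt cutter)."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def junction(left_enzyme, right_enzyme):
    """Top-strand hexamer formed when the two cut ends are ligated."""
    if overhang(left_enzyme) != overhang(right_enzyme):
        raise ValueError("ends are not compatible")
    left_site, left_cut = ENZYMES[left_enzyme]
    right_site, right_cut = ENZYMES[right_enzyme]
    return left_site[:left_cut] + right_site[right_cut:]


ALL_SITES = {site for site, _ in ENZYMES.values()}
PAIRS = [("SspI", "EcoRV"), ("BamHI", "BclI"), ("SalI", "XhoI")]

for a, b in PAIRS:
    for left, right in ((a, b), (b, a)):
        hybrid = junction(left, right)
        print(left, "+", right, "->", hybrid, "| recut:", hybrid in ALL_SITES)
```

In this model none of the hybrid junctions matches any of the six recognition sites, which is consistent with the directional, site-destroying ligation described in the text.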
The nucleotide sequence of vector V1R is as follows: 





TCGCGCGTTT CGGTGATGAC GGTGAAAACC TCTGACACAT GCAGCTCCCG GAGACGGTCA 
CAGCTTGTCT GTAAGCGGAT GCCGGGAGCA GACAAGCCCG TCAGGGCGCG TCAGCGGGTG 
TTGGCGGGTG TCGGGGCTGG CTTAACTATG CGGCATCAGA GCAGATTGTA CTGAGAGTGC 
ACCATATGCG GTGTGAAATA CCGCACAGAT GCGTAAGGAG AAAATACCGC ATCAGATTGG 
CTATTGGCCA TTGCATACGT TGTATCCATA TCATAATATG TACATTTATA TTGGCTCATG 
TCCAACATTA CCGCCATGTT GACATTGATT ATTGACTAGT TATTAATAGT AATCAATTAC 
GGGGTCATTA GTTCATAGCC CATATATGGA GTTCCGCGTT ACATAACTTA CGGTAAATGG 
CCCGCCTGGC TGACCGCCCA ACGACCCCCG CCCATTGACG TCAATAATGA CGTATGTTCC 
CATAGTAACG CCAATAGGGA CTTTCCATTG ACGTCAATGG GTGGAGTATT TACGGTAAAC 
TGCCCACTTG GCAGTACATC AAGTGTATCA TATGCCAAGT ACGCCCCCTA TTGACGTCAA 
TGACGGTAAA TGGCCCGCCT GGCATTATGC CCAGTACATG ACCTTATGGG ACTTTCCTAC 
TTGGCAGTAC ATCTACGTAT TAGTCATCGC TATTACCATG GTGATGCGGT TTTGGCAGTA 
CATCAATGGG CGTGGATAGC GGTTTGACTC ACGGGGATTT CCAAGTCTCC ACCCCATTGA 
CGTCAATGGG AGTTTGTTTT GGCACCAAAA TCAACGGGAC TTTCCAAAAT GTCGTAACAA 
CTCCGCCCCA TTGACGCAAA TGGGCGGTAG GCGTGTACGG TGGGAGGTCT ATATAAGCAG 
AGCTCGTTTA GTGAACCGTC AGATCGCCTG GAGACGCCAT CCACGCTGTT TTGACCTCCA 
TAGAAGACAC CGGGACCGAT CCAGCCTCCG CGGCCGGGAA CGGTGCATTG GAACGCGGAT 
TCCCCGTGCC AAGAGTGACG TAAGTACCGC CTATAGAGTC TATAGGCCCA CCCCCTTGGC 
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TTCTTATGCA TGCTATACTG TTTTTGGCTT 
ATAGGTGATG GTATAGCTTA GCCTATAGGT 
CTATTGGTGA CGATACTTTC CATTACTAAT 
TTATTGGCTA TATGCCAATA CACTGTCCTT 
5 AGGATGGGGT CTCATTTATT ATTTACAAAT 
CCGCAGTTTT TATTAAACAT AACGTGGGAT 
ACATGGGCTC TTCTCCGGTA GCGGCGGAGC 
CAGCGACTCA TGGTCGCTCG GCAGCTCCTT 
CAGCACGATG CCCACCACCA CCAGTGTGCC 

10 TGAAAATGAG CTCGGGGAGC GGGCTTGCAC 
GGCAGAAGAA GATGCAGGCA GCTGAGTTGT 
CGTTGCGGTG CTGTTAAC GG TGGAGGGCAG 
GCGCGCCACC AGACATAATA GCTGACAGAC 
CTGCAGTCAC CGTCCTTAGA TCTGCTGTGC 

15 CCTCCCCCGT GCCTTCCTTG ACCCTGGAAG 
ATGAGGAAAT TGCATCGCAT TGTCTGAGTA 
GGCAGCACAG CAAGGGGGAG GATTGGGAAG 
GCTCTATGGG TACCCAGGTG CTGAAGAATT 
AGGCACATCC CCTTCTCTGT GACACACCCT 

20 CACTCATAGG ACACTCATAG CTCAGGAGGG 
TTGGAGCGGT CTCTCCCTCC CTCATCAGCC 
GAAGAAATTA AAGCAAGATA GGCTATTAAG 
TGAGGAAGTA ATGAGAGAAA TCATAGAATT 
CGCTCGGTCG TTCGGCTGCG GCGAGCGGTA 

25 TCCACAGAAT CAGGGGATAA CGCAGGAAAG 
AGGAACCGTA AAAAGGCCGC GTTGCTGGCG 
CATCACAAAA ATCGACGCTC AAGTCAGAGG 
CAGGCGTTTC CCCCTGGAAG CTCCCTCGTG 
GGATACCTGT CCGCCTTTCT CCCTTCGGGA 

30 AGGTATCTCA GTTCGGTGTA GGTCGTTCGC 
GTTCAGCCCG ACCGCTGCGC CTTATCCGGT 
CACGACTTAT CGCCACTGGC AGCAGCCACT 
GGCGGTGCTA CAGAGTTCTT GAAGTGGTGG 
TTTGGTATCT GCGCTCTGCT GAAGCCAGTT 



GGGGTCTATA CACCCCCGCT TCCTCATGTT 
GTGGGTTATT GACCATTATT GACCACTCCC 
CCATAACATG GCTCTTTGCC ACAACTCTCT 
CAGAGACTGA CACGGACTCT GTATTTTTAC 
TCACATATAC AACACCACCG TCCCCAGTGC 
CTCCACGCGA ATCTCGGGTA CGTGTTCCGG 
TTCTACATCC GAGCCCTGCT CCCATGCCTC 
GCTCCTAACA GTGGAGGCCA GACTTAGGCA 
GCACAAGGCC GTGGCGGTAG GGTATGTGTC 
CGCTGACGCA TTTGGAAGAC TTAAGGCAGC 
TGTGTTCTGA TAAGAGTCAG AGGTAACTCC 
TGTAGTCTGA GCAGTACTCG TTGCTGCCGC 
TAACAGACTG TTCCTTTCCA TGGGTCTTTT 
CTTCTAGTTG CCAGCCATCT GTTGTTTGCC 
GTGCCACTCC CACTGTCCTT TCCTAATAAA 
GGTGTCATTC TATTCTGGGG GGTGGGGTGG 
ACAATAGCAG GCATGCTGGG GATGCGGTGG 
GACCCGGTTC CTCCTGGGCC AGAAAGAAGC 
GTCCACGCCC CTGGTTCTTA GTTCCAGCCC 
CTCCGCCTTC AATCCCACCC GCTAAAGTAC 
CACCAAACCA AACCTAGCCT CCAAGAGTGG 
TGCAGAGGGA GAGAAAATGC CTCCAACATG 
TCTTCCGCTT CCTCGCTCAC TGACTCGCTG 
TCAGCTCACT CAAAGGCGGT AATACGGTTA 
AACATGTGAG CAAAAGGCCA GCAAAAGGCC 
TTTTTCCATA GGCTCCGCCC CCCTGACGAG 
TGGCGAAACC CGACAGGACT ATAAAGATAC 
CGCTCTCCTG TTCCGACCCT GCCGCTTACC 
AGCGTGGCGC TTTCTCAATG CTCACGCTGT 
TCCAAGCTGG GCTGTGTGCA CGAACCCCCC 
AACTATCGTC TTGAGTCCAA CCCGGTAAGA 
GGTAACAGGA TTAGCAGAGC GAGGTATGTA 
CCTAACTACG GCTACACTAG AAGGACAGTA 
ACCTTCGGAA AAAGAGTTGG TAGCTCTTGA 
TCCGGCAAAC AAACCACCGC TGGTAGCGGT 
CGCAGAAAAA AAGGATCTCA AGAAGATCCT 
TGGAACGAAA ACTCACGTTA AGGGATTTTG 
TAGATCCTTT TAAATTAAAA ATGAAGTTTT 

TGGTCTGACA GTTACCAATG CTTAATCAGT 
CGTTCATCCA TAGTTGCCTG ACTCCGGGGG 
AGGTGTTGCT GACTCATACC AGGCCTGAAT 
GCCACGGTTG ATGAGAGCTT TGTTGTAGGT 
TGCCACGGAA CGGTCTGCGT TGTCGGGAAG 

AGTTCGATTT ATTCAACAAA GCCGCCGTCC 
TACAACCAAT TAACCAATTC TGATTAGAAA 
TTATTCATAT CAGGATTATC AATACCATAT 
GAAAACTCAC CGAGGCAGTT CCATAGGATG 
ACTCGTCCAA CATCAATACA ACCTATTAAT 

GAGAAATCAC CATGAGTGAC GACTGAATCC 
TTCCAGACTT GTTCAACAGG CCAGCCATTA 
AAACCGTTAT TCATTCGTGA TTGCGCCTGA 
GGACAATTAC AAACAGGAAT CGAATGCAAC 
ATATTTTCAC CTGAATCAGG ATATTCTTCT 

GCAGTGGTGA GTAACCATGC ATCATCAGGA 
GGCATAAATT CCGTCAGCCA GTTTAGTCTG 
CTACCTTTGC CATGTTTCAG AAACAACTCT 
ATTGTCGCAC CTGATTGCCC GACATTATCG 
TCCATGTTGG AATTTAATCG CGGCCTCGAG 

ACACCCCTTG TATTACTGTT TATGTAAGCA 
TTATCTTGTG CAATGTAACA TCAGAGATTT 
CATTATTGAA GCATTTATCA GGGTTATTGT 
TAGAAAAATA AACAAATAGG GGTTCCGCGC 
TAAGAAACCA TTATTATCAT GACATTAACC 

CGTC (SEQ ID NO:25). 



GGTTTTTTTG TTTGCAAGCA GCAGATTACG 
TTGATCTTTT CTACGGGGTC TGACGCTCAG 
GTCATGAGAT TATCAAAAAG GATCTTCACC 
AAATCAATCT AAAGTATATA TGAGTAAACT 
GAGGCACCTA TCTCAGCGAT CTGTCTATTT 
GGGGGGGCGC TGAGGTCTGC CTCGTGAAGA 
CGCCCCATCA TCCAGCCAGA AAGTGAGGGA 
GGACCAGTTG GTGATTTTGA ACTTTTGCTT 
ATGCGTGATC TGATCCTTCA ACTCAGCAAA 
CGTCAAGTCA GCGTAATGCT CTGCCAGTGT 
AACTCATCGA GCATCAAATG AAACTGCAAT 
TTTTGAAAAA GCCGTTTCTG TAATGAAGGA 
GCAAGATCCT GGTATCGGTC TGCGATTCCG 
TTCCCCTCGT CAAAAATAAG GTTATCAAGT 
GGTGAGAATG GCAAAAGCTT ATGCATTTCT 
CGCTCGTCAT CAAAATCACT CGCATCAACC 
GCGAGACGAA ATACGCGATC GCTGTTAAAA 
CGGCGCAGGA ACACTGCCAG CGCATCAACA 
AATACCTGGA ATGCTGTTTT CCCGGGGATC 
GTACGGATAA AATGCTTGAT GGTCGGAAGA 
ACCATCTCAT CTGTAACATC ATTGGCAACG 
GGCGCATCGG GCTTCCCATA CAATCGATAG 
CGAGCCCATT TATACCCATA TAAATCAGCA 
CAAGACGTTT CCCGTTGAAT ATGGCTCATA 
GACAGTTTTA TTGTTCATGA TGATATATTT 
TGAGACACAA CGTGGCTTTC CCCCCCCCCC 
CTCATGAGCG GATACATATT TGAATGTATT 
ACATTTCCCC GAAAAGTGCC ACCTGACGTC 
TATAAAAATA GGCGTATCAC GAGGCCCTTT 

EXAMPLE 2 

Codon Optimized HIV-1 Pol and HIV-1 IA Pol Derivatives as DNA Vector Vaccines 
Synthesis of WT-opt-pol and IA-opt-pol Genes - Construction of both genes was 
conducted by Midland Certified Reagent Company (Midland, TX) following 
established strategies. Ten double-stranded oligonucleotides, ranging from 159 to 340 
bases long and encompassing the entire pol gene, were synthesized by solid state 
methods and cloned separately into pUC18. For the wt-pol gene, the fragments are as 
follows: 

BglII - Ecl136II half site at #282 = pJS6A1-7 
PmlI half site at #285 - Ecl136II half site at #597 = pJS6B2-5 
SspI half site at #600 - Ecl136II half site at #866 = pJS6C1-4 
SmaI half site at #869 - ApaI #1095 = pJS6D1-4 
ApaI #1095 - KpnI #1296 = pJS6E1-4 
KpnI #1296 - XcmI #1636 = pJS6F1-5 
XcmI #1636 - NsiI #1847 = pJS6G1-2 
NsiI #1847 - BclI half site at #2174 = pJS6H1-14 
BclI half site at #2174 - SacI #2333 = pJS6I1-2 
SacI #2333 - BglII #2577 = pJS6J1-1 

EcoRI and HindIII sequences were added upstream of each 5' end and downstream of 
each 3' end, respectively, to allow cloning into the EcoRI-HindIII sites of pUC18. 
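
The stated size range of the cassettes can be checked against the boundary positions in the list above. The following is a minimal sketch in Python, assuming the plasmid names and positions are read correctly from the list; the first cassette (pJS6A1-7) is left out because only its 3' boundary is given, and the helper name `cassette_spans` is illustrative only.

```python
# Cassette boundaries (nucleotide positions) as listed for the wt-pol gene.
# pJS6A1-7 is omitted: the list gives only its 3' boundary (#282).
cassettes = {
    "pJS6B2-5": (285, 597),
    "pJS6C1-4": (600, 866),
    "pJS6D1-4": (869, 1095),
    "pJS6E1-4": (1095, 1296),
    "pJS6F1-5": (1296, 1636),
    "pJS6G1-2": (1636, 1847),
    "pJS6H1-14": (1847, 2174),
    "pJS6I1-2": (2174, 2333),
    "pJS6J1-1": (2333, 2577),
}


def cassette_spans(boundaries):
    """Return the distance between the two listed boundaries of each cassette."""
    return {name: end - start for name, (start, end) in boundaries.items()}


spans = cassette_spans(cassettes)
shortest = min(spans, key=spans.get)
longest = max(spans, key=spans.get)
print(shortest, spans[shortest])  # pJS6I1-2 159
print(longest, spans[longest])    # pJS6F1-5 340
```

Under this reading, the shortest and longest spans (159 and 340) agree with the range of fragment sizes quoted in the text.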

The next stage of the synthesis was to consolidate these cassettes into three 
roughly equal fragments (alpha, beta, gamma) and was performed as follows: 

Alpha: The SspI-HindIII small fragment of pJS6C1-4 was transferred into the 
Ecl136II-HindIII sites of pJS6B2-5 to give pJS6BC1-1. Into the EcoRI-PmlI sites of 
this plasmid was inserted the EcoRI-Ecl136II small fragment of pJS6A1-7 to give 
pJS6α1-8. 

Beta: The EcoRI-ApaI small fragment of pJS6D1-4 was inserted into the 
corresponding sites of pJS6E1-2 to give pJS6DE1-2. Also, the EcoRI-XcmI small 
fragment of pJS6F1-5 was inserted into the corresponding sites of pJS6G1-2 to give 
pJS6FG1-1. Then the EcoRI-KpnI small fragment of pJS6DE1-2 was inserted into 
the corresponding sites of pJS6FG1-1 to give pJS6β1-1. 

Gamma: The SacI-HindIII small fragment of pJS6J1-1 was inserted into the 
corresponding sites of pJS6I1-2 to give pJS6IJ1-1. This plasmid was propagated 
through E. coli SCS110 (dam-/dcm-) to permit subsequent cleavage at the BclI site. 
The BclI-HindIII small fragment of the unmethylated pJS6IJ1-1 was inserted into the 
BglII-HindIII sites of pJS6H1-14 to give pJS6γ1-1. 

The wt-pol alpha, beta, gamma were ligated into the entire sequence as 
follows: 

The EcoRI-Ecl136II small fragment of pJS6α1-8 was inserted into the EcoRI-SmaI 
sites of pJS6β1-1 to give pJS6αβ2-1. 

Into the NsiI-HindIII sites of this plasmid was inserted the NsiI-HindIII small 
fragment of pJS6γ1-1 to give pUC18-wt-pol. This final plasmid was completely 
resequenced in both strands. 

To construct the entire IA-pol gene, only 3 new small fragments were 
synthesized: 

PmlI half site at #285 - Ecl136II half site at #597 = pJS7B1-1 
KpnI #1296 - XcmI #1636 = pJS7F1-2 
NsiI #1847 - BglII half site at #2174 = pJS7H1-5 

These were then used in the same reconstruction strategy as described above to give 
pUC18-IA-pol. 

Expression Vector Construction - pUC18-wt-pol and pUC18-IA-pol were 
digested with BglII in order to isolate fragments containing the entire pol genes. V1R, 
V1Jns, and V1Jns-tpa (Shiver, et al., 1995, Immune responses to HIV gp120 elicited by 
DNA vaccination. In Vaccines 95 (eds. Chanock, R. M., Brown, F., Ginsberg, H. S., 
& Norrby, E.), pp. 95-98; Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor, New York; see also Example Section 1) were digested with BglII. The cut 
vectors were then treated with calf intestinal alkaline phosphatase. Both wt-pol and 
IA-pol genes were ligated into cut V1R using T4 DNA ligase (16 °C, overnight). 
Competent DH5α cells were transformed with aliquots of the ligation mixtures. 
Colonies were screened by restriction digestion of amplified plasmid isolates. 
Following a similar strategy, the BglII fragment containing the IA-pol was subcloned 
into the BglII site of V1Jns. To ligate the IA-pol gene into V1Jns-tpa, the IA-pol 
gene was PCR-amplified from V1R-IA-pol using pfu polymerase and the following 
pair of primers: 5'-GGTACAAGATCTCCGCCCCCATCTCCCCCATTGAGA-3' 
(SEQ ID NO:26), and 5'-CCACATAGATCTGCCCGGGCTTTAGTCCTCATC-3' 
(SEQ ID NO:27). The upstream primer was designed to remove the initiation Met 
codon and place the pol gene in frame with the tpa leader coding sequence from 
V1Jns-tpa. The PCR product was purified from the agarose gel slab using Sigma 
DNA Purification spin columns. The purified products were digested with BglII and 
subcloned into the BglII site of V1Jns-tpa. 
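
The features described for the two PCR primers can be illustrated with a short check. The following is a minimal sketch in Python, assuming the primer sequences are as given for SEQ ID NO:26 and SEQ ID NO:27 and that AGATCT is the BglII recognition sequence; the helper names and the choice of reading frame for the downstream primer (starting at the first base of its reverse complement) are assumptions made for illustration.

```python
BGLII_SITE = "AGATCT"  # BglII recognition sequence
STOP_CODONS = {"TAA", "TAG", "TGA"}

primers = {
    "SEQ ID NO:26": "GGTACAAGATCTCCGCCCCCATCTCCCCCATTGAGA",  # upstream primer
    "SEQ ID NO:27": "CCACATAGATCTGCCCGGGCTTTAGTCCTCATC",     # downstream primer
}


def site_positions(seq, site):
    """Return every 0-based start index at which `site` occurs in `seq`."""
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq[::-1].translate(str.maketrans("ACGT", "TGCA"))


# Both primers carry a single BglII site near the 5' end.
for name, seq in primers.items():
    print(name, site_positions(seq, BGLII_SITE))  # [6] for each primer

# The upstream primer contains no ATG, consistent with removal of the start codon.
print("ATG" in primers["SEQ ID NO:26"])  # False

# Read in the sense orientation, the downstream primer ends the gene with a stop codon.
sense = reverse_complement(primers["SEQ ID NO:27"])
codons = [sense[i:i + 3] for i in range(0, len(sense) - 2, 3)]
print(codons[:4])  # ['GAT', 'GAG', 'GAC', 'TAA']
```

The fourth codon in this assumed frame is a stop codon, immediately upstream of the SmaI (CCCGGG) and BglII sites carried by the downstream primer.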

Results - The codon humanized wt- and IA-pol genes were constructed via 
stepwise ligation of 10 synthetic dsDNA fragments (Ferretti, et al., 1986, Proc. Natl. 
Acad. Sci. USA 83: 599-603). For expression in mammalian systems, the IA-pol gene 
was subcloned into V1R, V1Jns, and V1Jns-tpa. All these vectors place the gene 
under the control of the human cytomegalovirus/intron A hybrid promoter 
(hCMVIA). The DNA sequence of the IA-pol gene and the expressed protein product 
are shown in Figure 2A-B. Subcloning into V1Jns-tpa attaches the leader sequence 
from human tissue-specific plasminogen activator (tpa) to the N-terminus of the 
IA-pol (Pennica, et al., 1983, Nature 301: 214-221) to allow secretion of the protein. 
The sequences of the tpa leader and the fusion junction are shown in Figure 3. 

EXAMPLE 3 

HIV-1 Pol Vaccine - Rodent Studies 

Materials - E. coli DH5α strain, penicillin, streptomycin, ACK lysis buffer, 
Hepes, L-glutamine, RPMI1640, and ultrapure CsCl were obtained from Gibco/BRL 
(Grand Island, NY). Fetal bovine serum (FBS) was purchased from Hyclone. 
Kanamycin, Tween 20, bovine serum albumin, hydrogen peroxide (30%), 
concentrated sulfuric acid, β-mercaptoethanol (β-ME), and concanavalin A were 
obtained from Sigma (St. Louis, MO). Female BALB/c mice at 4-6 wks of age were 
obtained from Taconic Farms (Germantown, NY). 0.3-mL insulin syringes were 
purchased from Myoderm. 96-well flat-bottomed Maxisorp plates were obtained from 
NUNC (Rochester, NY). HIV-1 IIIB RT p66 recombinant protein was obtained from 
Advanced Biotechnologies, Inc. (Columbia, MD). 20-mer peptides were synthesized 
by Research Genetics (Huntsville, AL). Horseradish peroxidase (HRP)-conjugated 
rabbit anti-mouse IgG1 was obtained from ZYMED (San Francisco, CA). 
1,2-phenylenediamine dihydrochloride (OPD) tablets were obtained from DAKO 
(Norway). Purified rat anti-mouse IFN-gamma (IgG1, clone R4-6A2), 
biotin-conjugated rat anti-mouse IFN-gamma (IgG1, clone XMG1.2), and 
streptavidin-alkaline phosphatase conjugate were purchased from PharMingen (San Diego, CA). 
1-STEP NBT/BCIP dye was obtained from Pierce Chemicals (Rockford, IL). 96-well 
Multiscreen membrane plates were purchased from Millipore (France). Cell strainers 
were obtained from Becton-Dickinson (Franklin Lakes, NJ). 

Plasmid Preparation - E. coli DH5α cells expressing the pol plasmids were 
grown to saturation in LB broth supplemented with 100 µg/mL kanamycin. Plasmids 
were purified by the standard CsCl method and solubilized in saline at concentrations 
greater than 5 mg/mL until further use. 

Vaccination - The plasmids were prepared in phosphate-buffered saline and 
administered to BALB/c mice by needle injection (28-1/2G insulin syringe) of a 50 µL 
aliquot into each quad muscle. V1Jns-IApol was administered at 0.3, 3, and 30 µg doses 
and, for comparison, V1Jns-tpa-IApol was given at a 30 µg dose. Immunizations were 
conducted at T=0 and T=8 wks (for select animals from the 30-µg dose cohorts). 

ELISA Assay - At T=12 wks, blood samples were collected by making an 
incision of a tail vein and the serum separated. Anti-RT titers were obtained 
following standard secondary antibody-based ELISA. Briefly, Maxisorp plates were 
coated by overnight incubation with 100 µL of 1 µg/mL HIV-1 RT protein (in PBS). 
The plates were washed with PBS/0.05% Tween 20 and incubated for approx. 2 h with 
200 µL/well of blocking solution (PBS/0.05% Tween/1% BSA). The blocking 
solution was decanted; 100 µL aliquots of serially diluted serum samples were added 
per well and incubated for 2 h at room temperature. The plates were washed and 100 
µL of 1/1000-diluted HRP-rabbit anti-mouse IgG was added with 1 h incubation. 
The plates were washed thoroughly and soaked with 100 µL OPD/H2O2 solution for 
15 min. The reaction was quenched by adding 100 µL of 0.5 M H2SO4 per well. 
OD492 readings were recorded. 

ELIspot - Spleens were collected from 5 mice/cohort at T=13-14 wks and 
pooled into a tube of 8-mL R10 medium (RPMI1640, 10% FBS, 2 mM L-glutamine, 
100 U/mL penicillin, 100 u/mL streptomycin, 10 mM Hepes, 50 µM β-ME). 
Multiscreen opaque plates were coated with 100 µL/well of capture mAb (purified 
R4-6A2 diluted in PBS to 5 µg/mL) at 4°C overnight. The plates were washed with 
PBS/Pen/Strep in hood and blocked with 200 µL/well of complete R10 medium at 
37°C for at least 2 hrs. The mouse spleens were ground on steel mesh, collected into 
15 mL tubes and centrifuged at 1200 rpm for 10 min. The pellet was treated in ACK 
buffer (4 mL of lysis buffer per spleen) for 5 min at room temperature to lyse red blood 
cells. The cell pellet was centrifuged as before, resuspended in K-medium (5 mL per 
mouse spleen), filtered through a cell strainer and counted using a hemacytometer. 
Block medium was decanted from the plates and 100 µL/well of cell samples (5.0 x 10^5 
cells per well) plus antigens were added. Pol-specific CD4+ cells were stimulated 
using a mixture of two previously identified epitope-containing peptides (aa641-660, 
aa731-750). Antigen-specific CD8+ cells were stimulated using a pool of four 
epitope-containing peptides (aa201-220, aa311-330, aa571-590, aa781-800) 
or with individual peptides. A final concentration of 4 µg/mL per peptide was used. 
Each splenocyte sample was tested for IFN-gamma secretion by adding the mitogen 
concanavalin A. Plates were incubated at 37°C, 5% CO2 for 20-24 h. The plates 
were washed with PBS/0.05% Tween 20 and soaked with 100 µL/well of 5 µg/mL 
biotin-conjugated rat anti-mouse IFN-gamma mAb (clone XMG1.2) at 4°C overnight. The 
plates were washed and soaked with 100 µL/well of a 1/2500 dilution of streptavidin-AP 
(in PBS/0.005% Tween/5% FCS) for 30 min at 37 °C. Following a wash, spots were 
developed by incubating with 100 µL/well 1-step NBT/BCIP for 6-10 min. The plates 
were washed with water and allowed to air dry. The number of spots in each well 
was determined using a dissecting microscope and normalized to 10^6 cells. 
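
The normalization of spot counts to one million input cells can be written out as a small calculation. The following is a minimal sketch in Python, assuming that replicate wells are averaged before scaling; the function name `sfc_per_million` and the well counts are hypothetical and are not data from this study.

```python
def sfc_per_million(spot_counts, cells_per_well):
    """Average replicate-well spot counts and scale to spots per 10^6 input cells."""
    if not spot_counts:
        raise ValueError("at least one well count is required")
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    mean_spots = sum(spot_counts) / len(spot_counts)
    return mean_spots * 1_000_000 / cells_per_well


# Hypothetical replicate counts for wells seeded with 5.0 x 10^5 cells each.
print(sfc_per_million([42, 38, 46], 5.0e5))  # 84.0
```

With 5.0 x 10^5 cells per well, the scaling factor is simply 2, so a mean of 42 spots per well corresponds to 84 spot-forming cells per million.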

Results - A single vaccination of BALB/c mice with V1Jns-IApol is able to induce 
antigen-specific antibody (Figure 4) and T cell (Figure 5) responses in a 
dose-dependent manner. IFN-gamma secretion from splenocytes can be detected from the 3 and 
30 µg cohorts following stimulation with pools of peptides that contain CD4+ and 
CD8+ T cell epitopes. These epitopes were identified by (1) screening 20-mer 
peptides that encompass the entire pol sequence and overlap by 10 amino acids for 
ability to stimulate IFN-gamma secretion from vaccinee splenocytes, and (2) 
determining the T cell type (CD4+ or CD8+) by depleting either population in an 
ELIspot assay. Addition of the tpa leader sequence to the pol gene is able to induce 
comparable, if not slightly higher, frequencies of pol-specific CD4+ and CD8+ cells. 
A second immunization with either V1Jns-IApol or V1Jns-tpa-IApol resulted in 
effective boosting of the immune responses. 
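
The layout of a peptide library of 20-mers overlapping by 10 residues can be sketched as follows. This is an illustrative sketch in Python, not the procedure used to design the actual peptides: the function names, the placeholder sequence, and the choice to add a final window flush with the C-terminus are all assumptions.

```python
def overlapping_peptides(protein, length=20, overlap=10):
    """Split `protein` into windows of `length` residues that overlap by `overlap`.

    Returns (1-based start position, peptide) pairs. A final window flush with
    the C-terminus is added when the regular stepping does not reach the end
    (an assumption made for this sketch).
    """
    step = length - overlap
    last_start = max(len(protein) - length, 0)
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(s + 1, protein[s:s + length]) for s in starts]


def peptides_covering(position, peptides):
    """Return the start positions of peptides containing residue `position` (1-based)."""
    return [start for start, pep in peptides if start <= position < start + len(pep)]


# Placeholder 55-residue sequence; not a Pol sequence.
protein = ("ACDEFGHIKLMNPQRSTVWY" * 3)[:55]
peptides = overlapping_peptides(protein)
print([start for start, _ in peptides])  # [1, 11, 21, 31, 36]
print(peptides_covering(25, peptides))   # [11, 21]
```

In such a layout each interior residue falls within two peptides, so a single amino acid substitution generally alters two members of the library.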

EXAMPLE 4 
HIV-1 Pol Vaccine - Non Human Primate Studies 
Materials - E. coli DH5α strain, penicillin, streptomycin, and ultrapure CsCl 
were obtained from Gibco/BRL (Grand Island, NY). Kanamycin and 
phytohemagglutinin (PHA-M) were obtained from Sigma (St. Louis, MO). 20-mer 
peptides were synthesized by SynPep (Dublin, CA) and Research Genetics 
(Huntsville, AL). 96-well Multiscreen Immobilon-P membrane plates were obtained 
from Millipore (France). Streptavidin-alkaline phosphatase conjugate was purchased 
from Pharmingen (San Diego, CA). 1-Step NBT/BCIP dye was obtained from Pierce 
Chemicals (Rockford, IL). Rat anti-human IFN-gamma mAb and biotin-conjugated 
anti-human IFN-gamma reagent were obtained from R&D Systems (Minneapolis, 
MN). Dynabeads M-450 anti-human CD4 were obtained from Dynal (Norway). 
HIV p24 antigen assay was purchased from Coulter Corporation (Miami, FL). 
HIV-1 IIIB RT p66 recombinant protein was obtained from Advanced Biotechnologies, Inc. 
(Columbia, MD). Plastic 8-well strips/plates, flat bottom, Maxisorp, were obtained 
from NUNC (Rochester, NY). HIV+ human serum 9711234 was obtained from 
Biological Specialty Corp. 

Plasmid Preparation - E. coli DH5α cells expressing the pol plasmids were 
grown to saturation in LB supplemented with 100 µg/mL kanamycin. Plasmids were 
purified by the standard CsCl method and solubilized in saline at concentrations greater 
than 5 mg/mL until further use. 

Vaccination - Cohorts of 3 rhesus macaques (approx. 5-10 kg) were 
vaccinated with a 5 mg dose of either V1Jns-IApol or V1Jns-tpa-IApol. The vaccine 
was administered by needle injection of two 0.5 mL aliquots of 5 mg/mL plasmid 
solution (in phosphate-buffered saline, pH 7.2) into both deltoid muscles. Prior to 
vaccination, the monkeys were chemically restrained with i.m. injection of 10 mg/kg 
ketamine. The animals were immunized 3x at 4 week intervals (T=0, 4, 8 wks). 

Sample Collection - Blood samples were collected at T = 0, 4, 8, 12, 16, 18 
wks; sera and PBMCs were isolated using established protocols. 

ELIspot Assay - Immobilon-IP plates were coated with 100 µL/well of rat 
anti-human IFN-gamma mAb at 15 µg/mL at 4 °C overnight. The plates were then washed 
with PBS and blocked by adding 200 µL/well of R10 medium. 4 x 10^5 peripheral blood 
cells were plated per well and to each well, either media or one of the pol peptide 
pools (final concentration of 4 µg/mL per peptide) or PHA, a known mitogen, was 
added to a final volume of 100 µL. Duplicate wells were set up per sample per 
antigen and stimulation was performed for 20-24 h at 37 °C. The plates were then 
washed; biotinylated anti-human IFN-gamma reagent was added (0.1 µg/mL, 100 µL 
per well) and allowed to incubate overnight at 4 °C. The plates were again washed 
and 100 µL of a 1:2500 dilution of the streptavidin-alkaline phosphatase reagent (in 
PBS/0.005% Tween/5% FCS) was added and allowed to incubate for 2 h at ambient 
room temperature. After another wash, spots were developed by incubating with 100 
µL/well of 1-step NBT/BCIP for 6-10 min. CD4+ T cell depletion was performed by 
adding 1 bead particle/10 cell of Dynabeads M450 anti-human CD4, prewashed with 
PBS, and incubating on the shaker at 4 °C for 30 min. The beads were fractionated 
magnetically and the unbound cells collected and quantified before plating onto the 
ELIspot assay plates (at 4 x 10^5 cells per well). 

CTL Assay - Procedures for establishing bulk CTL culture with fresh or 
cryopreserved peripheral blood mononuclear cells (PBMC) are as follows. Twenty 
percent of total PBMC were infected in 0.5 ml volume with recombinant vaccinia virus, 
Vac-tpaPol, at a multiplicity of infection (moi) of 5 for 1 hr at 37°C, and 
then combined with the remaining PBMC sample. The cells were washed once in 10 
ml R-10 medium, and plated in a 12 well plate at approximately 5 to 10 x 10^6 
cells/well in 4 ml R-10 medium. Recombinant human IL-7 was added to the culture 
at the concentration of 330 U/ml. Two or three days later, one milliliter of R-10 
containing recombinant human IL-2 (100 U/ml) was added to each well. Twice 
weekly thereafter, two milliliters of cultured media were replaced with 2 ml fresh 
R-10 medium with rhIL-2 (100 U/ml). The lymphocytes were cultured at 37°C in the 
presence of 5% CO2 for approximately 2 weeks, and used in the cytotoxicity assay as 
described below. The effector cells harvested from bulk CTL cultures were tested 
against autologous B lymphoid cell lines (BLCL) sensitized with peptide pools. To 
prepare the peptide-sensitized targets, the BLCL cells were washed once with 
R-10 medium, enumerated, and pulsed with peptide pool (about 4 to 8 µg/ml 
concentration for each individual peptide) in 1 ml volume overnight. A mock target 
was prepared by pulsing cells with peptide-free DMSO diluent to match the DMSO 
concentration in the peptide-pulsed targets. The cells were enumerated the next 
morning, and 1 x 10^6 cells were resuspended in 0.5 ml R-10 medium. Five to ten 
microliters of Na2[51Cr]O4 were added to the tubes at the same time, and the cells were 
incubated for 1 to 2 hr at 37°C. The cells were then washed 3 times and resuspended at 
5 x 10^4 cells/ml in R-10 medium to be used as target cells. The cultured lymphocytes 
were plated with target cells at designated effector to target (E:T) ratios in triplicates 
in 96-well plates, and incubated at 37°C for 4 hours in the presence of 5% CO2. A 
sample of 30 µl supernatant from each well of cell mixture was harvested onto a well 
of a Lumaplate-96 (Packard Instrument, Meriden, CT), and the plate was allowed to 
air dry overnight. The amount of 51Cr in the well was determined through 
beta-particle emission, using a plate counter from Packard Instrument. The percentage of 
specific lysis was calculated using the formula: % specific lysis = (E - S) / (M - S). 
The symbol E represents the average cpm released from target cells in the presence of 
effector cells, S is the spontaneous cpm released in the presence of medium only, and 
M is the maximum cpm released in the presence of 2% Triton X-100. 
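
The specific lysis calculation can be written out directly from the definitions of E, S and M. The following is a minimal sketch in Python, assuming that the ratio (E - S) / (M - S) is multiplied by 100 to express it as a percentage and that E is the mean of replicate wells; the function name and the cpm values are hypothetical.

```python
def percent_specific_lysis(experimental_cpm, spontaneous_cpm, maximum_cpm):
    """Compute % specific lysis = 100 * (E - S) / (M - S).

    experimental_cpm: replicate cpm values from wells with effector cells (averaged to E)
    spontaneous_cpm:  cpm released in medium only (S)
    maximum_cpm:      cpm released with 2% Triton X-100 (M)
    """
    if not experimental_cpm:
        raise ValueError("at least one experimental well is required")
    window = maximum_cpm - spontaneous_cpm
    if window <= 0:
        raise ValueError("maximum release must exceed spontaneous release")
    mean_e = sum(experimental_cpm) / len(experimental_cpm)
    return 100.0 * (mean_e - spontaneous_cpm) / window


# Hypothetical triplicate wells at one E:T ratio.
print(percent_specific_lysis([1500, 1600, 1400], 500, 2500))  # 50.0
```

The guard on M - S reflects that the formula is undefined when maximum release does not exceed spontaneous release.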

ELISA Assay - The pol-specific antibodies in the monkeys were measured in a 
competitive RT EIA assay, wherein sample activity is determined by the ability to 
block RT antigen from binding to coating antibody on the plate well. Briefly, 
Maxisorp plates were coated with saturating amounts of pol positive human serum 
(97111234). 250 µL of each sample was incubated with 15 µL of 266 ng/mL RT 
recombinant protein (in RCM 563, 1% BSA, 0.1% Tween, 0.1% NaN3) and 20 µL of 
lysis buffer (Coulter p24 antigen assay kit) for 15 min at room temperature. Similar 
mixtures were prepared using serially diluted samples of a standard and a negative 
control which defines maximum RT binding. 200 µL/well of each sample and 
standard were added to the washed plate and the plate incubated 16-24 h at room 
temperature. Bound RT was quantified following the procedures described in the Coulter 
p24 assay kit and reported in milliMerck units per mL arbitrarily defined by the 
chosen standard. 

Results - Repeated vaccinations with V1Jns-IApol induced in 1 of 3 monkeys 
(94R033) significant levels of antigen-specific T cell activation (Figure 6A-C and 
Table 2) and CTL killing of peptide-pulsed autologous cells (Figure 7A-B). A 
significant CD8+ component to the T cell responses in this animal was confirmed by 
peptide-stimulation of CD4-depleted PBMCs in an ELIspot assay (Table 2). 

Immunization with V1Jns-tpa-IApol produced T cell responses from all 3 
vaccinees (Figures 6A-C, Figure 7A-B; Table 2). Two (920078, 94R028) exhibited 
bulk CTL activity and detectable CD8+ components as measured by ELIspot analyses 
of CD4-depleted PBMCs. For the third monkey (920073), the activated T cells were 
largely CD4+ (Table 2). Table 3 shows the time course data on the frequency of 
IFN-gamma secreting cells (SFC/million cells) upon antigen-specific stimulation for 
monkeys vaccinated 3x with either V1Jns-IApol or V1Jns-tpa-IApol (5 mg dose). At 
T=18 wks, CD4-cell depletion was performed; the reported values are the number of 
spots per million of fractionated cells and are not corrected for the resultant 
enrichment of CD8+ T cells. PBMCs were stimulated with peptide pools that 
represent either IA pol protein (mpol-1, mpol-2) or wt Pol (wtpol-1, wtpol-2). 

For the ELIspot assay, antigen-specific stimulation was performed using 
pools of 20-mer peptides based on the vaccine sequence. The vaccine pol 
sequence differs from the wild-type HIV-1 sequence by 9 point mutations, thereby 
affecting 16 of the 20-mer peptides in the pool. Comparable responses were observed 
in the vaccinees when these peptides were replaced with those using the wild-type 
sequences. 

Four of the vaccinees gave anti-RT titers above background after 3 doses of 
the plasmids (Table 2). 

TABLE 3 

Anti-RT levels in Rhesus Macaques Vaccinated 3x (4 week intervals) with 
5 mg of V1Jns-IApol or V1Jns-tpa-IApol expressed in mMU/mL. 

EXAMPLE 5 

Effect of Codon Optimization on In Vivo Expression and 
Cellular Immune Response of wt-pol 
Materials and Methods - Extraction of virus-derived pol gene - The gene for RT-IN 
(wt-pol; a non-codon optimized wild type pol gene derived directly from the HIV IIIB 
genome) was extracted and amplified from the HIV IIIB genome using two primers, 
5'-CAG GCG AGA TCT Ate ATG GCC CCC ATT AGC CCT ATT GAG ACT 
GTA-3' (SEQ ID NO:29) and 5'-CAG GCG AGA TCT GCC CGG GCT TTA ATC 
CTC ATC CTG TCT ACT TGC CAC-3' (SEQ ID NO:30), containing BglII sites. 
The reaction contained 200 nmol of each primer, 2.5 U of pfu Turbo DNA 
polymerase (Stratagene, La Jolla, CA), 0.2 mM of each dNTP, and the template 
DNA in 10 mM KCl, 10 mM (NH4)2SO4, 20 mM Tris-HCl pH 8.75, 2 mM MgSO4, 
0.1% Triton X-100, 0.1 mg/ml bovine serum albumin (BSA). Thermocycling 
conditions were as follows: 20 cycles of 1 min at 95 °C, 1 min at 56 °C, and 4 min at 
72 °C with 15-min capping at 72 °C. The digested PCR fragment was subcloned into 
the BglII site of the expression plasmid V1Jns (Shiver, et al., 1995, Immune responses 
to HIV gp120 elicited by DNA vaccination. In Chanock, R. M., Brown, F., Ginsberg, 
H. S., and Norrby, E. (Eds.) Vaccines 95. Cold Spring Harbor Laboratory Press, Cold 
Spring Harbor, New York, pp 95-98; see also Example section 1 herein) 
following similar procedures as described above. The ligation mixtures were 
then used to transform competent E. coli DH5α cells and screened by PCR 
amplification of individual colonies. Sequence of the entire gene insert was 
confirmed. All plasmid constructs for animal immunization were purified by the CsCl 
method (Sambrook, Fritsch, and Maniatis (Eds.), 1989, Molecular cloning: a 
laboratory manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor). 

In vitro expression in mammalian cells - 1.5 x 10^6 293 cells were transfected 
with 1 or 10 µg of V1R-wt-pol (codon optimized) and V1Jns-wt-pol (virus derived) 
using the Cell Phect kit and incubated for 48 h at 37 °C, 5% CO2, 90% humidity. 
Supernatants and cell lysates were prepared and assayed for protein content using 
Pierce Protein Assay reagent (Rockford, IL). Aliquots containing equal amounts of 
total protein were loaded onto 10-20% Tris glycine gel (Novex, San Diego, CA) along 
with the appropriate molecular weight markers. The pol product was detected using 
anti-serum from a seropositive patient (Scripps Clinic, San Diego, CA) diluted 1:1000 
and the bands developed using goat anti-human IgG-HRP (Bethyl, Montgomery, TX) 
at 1:2000 dilution and standard ECL reagent kit (Pharmacia LKB Biotechnology, 
Uppsala, Sweden). 

Ultrasensitive RT activity assay of pol constructs - RT activities from codon 
optimized wt-pol and IA pol plasmids were analyzed by the Product-Enhanced 
Reverse Transcriptase (PERT) assay using Perkin Elmer 7700, Taqman technology 
(Arnold, et al., 1999, One-step fluorescent probe product-enhanced reverse 
transcriptase assay. In McClelland, M., Pardee, A. (Eds.) Expression genetics: 
accelerated and high-throughput methods. Biotechniques Books, Natick, MA, pp. 
201-210). Background levels for this assay were determined using a 1:100,000 dilution 
of lysates from mock (chemical treatment only, no vector) transfected 293 cells. This 
background range is set as RT/reaction tube of 0.00 to 56.28, which is taken from the 
mean value of 13.80 +/- 3 standard deviations (sd=14.16). Any individual value 
>56.28 would be considered positive for the PERT assay. Cell lysates were prepared 
similarly for the following samples: mock transfection with empty V1Jns vector; no 
vector control; transfection with V1Jns-tpa-pol (codon optimized); and transfection 
with V1Jns-IApol (codon optimized). Samples were serially diluted to 1:100,000 in 
PERT buffer and 24 replicates for each sample at this dilution were assayed for RT 
activity. 
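
The background range quoted for the PERT assay follows from the stated mean and standard deviation. The following is a minimal sketch in Python, assuming the lower bound is clipped at zero because a negative RT activity is not meaningful; the function names are illustrative only.

```python
def background_range(mean, sd, n_sd=3.0):
    """Return (lower, upper) limits of the background as mean +/- n_sd * sd.

    The lower limit is clipped at 0.0 (an assumption of this sketch).
    """
    lower = max(mean - n_sd * sd, 0.0)
    upper = mean + n_sd * sd
    return lower, upper


def is_positive(value, upper_limit):
    """A reading counts as positive only if it exceeds the upper background limit."""
    return value > upper_limit


lower, upper = background_range(13.80, 14.16)
print(round(lower, 2), round(upper, 2))  # 0.0 56.28
```

With a mean of 13.80 and a standard deviation of 14.16, three standard deviations add 42.48, which gives the upper limit of 56.28 used as the positivity cutoff.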

Rodent immunization with optimized and virus-derived pol plasmids - To 
compare the immunogenic properties of wt-pol (codon optimized) and virus-derived 
pol gene, cohorts of BALB/c mice (N=10) were vaccinated with 1 µg, 10 µg, and 100 
µg doses of V1R-wt-pol (codon optimized) and V1Jns-wt-pol plasmid (virus derived). 
At 5 weeks post dose 1, 5 of 10 mice per cohort were boosted with the same dose of 
plasmid they initially received. In all cases, the vaccines were suspended or diluted in 
6 mM sodium phosphate, 150 mM sodium chloride, pH 7.2, and the total dose was 
injected into both quadriceps muscles in 50 µL aliquots using a 0.3-mL insulin syringe 
with 28-1/2G needles (Becton-Dickinson, Franklin Lakes, NJ). 

Anti-RT ELISA - Anti-RT titers were obtained following standard secondary 
antibody-based ELISA. Maxisorp plates (NUNC, Rochester, NY) were coated by 
overnight incubation with 100 µL of 1 µg/mL HIV-1 RT protein (Advanced 
Biotechnologies, Columbia, MD) in PBS. The plates were washed with PBS/0.05% 
Tween 20 using a Titertek MAP instrument (Huntsville, AL) and incubated for 
approximately 2 h with 200 µL/well of blocking solution (PBS/0.05% Tween/1% 
BSA). The blocking solution was decanted; 100 µL aliquots of serially diluted serum 
samples were added per well and incubated for 2 h at room temperature. An initial 
dilution of 100-fold was performed followed by 4-fold serial dilutions. The plates were 
washed and 100 µL of 1/1000-diluted HRP-rabbit anti-mouse IgG (ZYMED, San 
Francisco, CA) was added with 1 h incubation. The plates were washed thoroughly 
and soaked with 100 µL 1,2-phenylenediamine dihydrochloride/hydrogen peroxide 
(DAKO, Norway) solution for 15 min. The reaction was quenched by adding 100 µL 
of 0.5 M H2SO4 per well. OD492 readings were recorded using a Titertek Multiskan 
MCC/340 with S20 stacker. Endpoint titers were defined as the highest serum 
dilution that resulted in an absorbance value of greater than or equal to 0.1 OD492 (2.5 
times the background value). 
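
The endpoint titer definition can be illustrated with a short calculation over a dilution series. The following is a minimal sketch in Python, assuming a series that starts at 1:100 and proceeds in 4-fold steps as described; the number of dilution steps, the function names, and the OD492 readings are hypothetical.

```python
def dilution_series(initial=100, fold=4, steps=8):
    """Reciprocal dilutions: 100, 400, 1600, ... (the number of steps is an assumption)."""
    return [initial * fold ** i for i in range(steps)]


def endpoint_titer(od_by_dilution, cutoff=0.1):
    """Return the highest reciprocal dilution with OD492 >= cutoff, or None if none qualifies."""
    positive = [dilution for dilution, od in od_by_dilution.items() if od >= cutoff]
    return max(positive) if positive else None


dilutions = dilution_series()
# Hypothetical OD492 readings for one serum sample, one per dilution.
readings = dict(zip(dilutions, [1.85, 1.40, 0.82, 0.35, 0.14, 0.06, 0.04, 0.04]))
print(endpoint_titer(readings))  # 25600
```

A sample with no well at or above the cutoff returns None here, which a caller might report as a titer below the starting dilution.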

ELIspot assay - Antigen-specific IFN-γ-secreting cells from mouse spleens
were detected using the ELIspot assay (Miyahira, et al., 1995, Quantification of
antigen specific CD8+ T cells using an ELISPOT assay. J. Immunol. Methods 1995,
181, 45-54). Typically, spleens were collected from 3-5 mice/cohort and pooled into
a tube containing 8 mL of complete RPMI media (RPMI 1640, 10% FBS, 2 mM L-glutamine,
100 U/mL penicillin, 100 u/mL streptomycin, 10 mM Hepes, 50 µM β-ME).
Multiscreen opaque plates (Millipore, France) were coated with 100 µL/well of
5 µg/mL purified rat anti-mouse IFN-γ IgG1, clone R4-6A2 (Pharmingen, San Diego,
CA), in PBS at 4 °C overnight. The plates were washed with
PBS/penicillin/streptomycin in a hood and blocked with 200 µL/well of complete
RPMI media at 37 °C for at least 2 h. The mouse spleens were ground on steel
mesh, collected into 15 mL tubes and centrifuged at 1200 rpm for 10 min. The pellet
was treated with 4 mL ACK buffer (Gibco/BRL) for 5 min at room temperature to
lyse red blood cells. The cell pellet was centrifuged as before, resuspended in
complete RPMI media (5 mL per mouse spleen), filtered through a cell strainer and
counted using a hemacytometer. Block media was decanted from the plates and, to
each well, 100 µL of cell sample (5 x 10^5 cells per well) and 100 µL of the antigen
solution were added. To the control well, 100 µL of the media were added; for
specific responses, peptide pools containing either CD4+ or CD8+ epitopes were
added. In all cases, a final concentration of 4 µg/mL per peptide was used. Each
sample/antigen mixture was tested in triplicate wells. Plates were incubated at
37 °C, 5% CO2, 90% humidity for 20-24 h. The plates were washed with PBS/0.05%
Tween 20 and incubated with 100 µL/well of 1.25 µg/mL biotin-conjugated rat anti-
mouse IFN-γ mAb, clone XMG1.2 (Pharmingen) at 4 °C overnight. The plates were
washed and incubated with 100 µL/well of a 1/2500 dilution of streptavidin-alkaline
phosphatase conjugate (Pharmingen) in PBS/0.005% Tween/5% FBS for 30 min at
37 °C. Following a wash, spots were developed by incubating with 100 µL/well 1-step
NBT/BCIP (Pierce Chemicals) for 6-10 min. The plates were washed with water and
allowed to air dry. The number of spots in each well was determined using a
dissecting microscope and the data normalized to 10^6 cell input.
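
The normalization step can be sketched as follows, assuming the triplicate wells are averaged before scaling (the text states only that data were normalized to 10^6 cell input). The well counts are hypothetical, and the function and variable names are illustrative.

```python
from statistics import mean

CELLS_PER_WELL = 5e5  # cell input per well, as stated in the protocol


def spots_per_million(triplicate_counts, cells_per_well=CELLS_PER_WELL):
    """Average the spot counts of replicate wells and scale to a 10^6 cell input."""
    return mean(triplicate_counts) * (1e6 / cells_per_well)


# Hypothetical spot counts from triplicate wells.
antigen_wells = [48, 52, 50]   # wells stimulated with a peptide pool
control_wells = [3, 2, 4]      # media-only control wells

sfc_antigen = spots_per_million(antigen_wells)
sfc_control = spots_per_million(control_wells)
print(sfc_antigen, sfc_control)
```

With 5 x 10^5 cells per well the scaling factor is 2, so an average of 50 spots per well corresponds to 100 spot-forming cells per 10^6 cells.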

Results - In vitro expression of Pol in mammalian cells - Heterologous
expression of the optimized wt or IA pol genes (V1R-wt-pol (codon optimized),
V1Jns-IApol (codon optimized), V1Jns-tpa-IApol (codon optimized)) in 293 cells
(Figure 8) yielded a single polypeptide of the correct approximate molecular size
(90 kDa) for the RT-IN fusion product. In contrast, no expression could be detected
after transfecting cells with 1 and 10 µg of V1Jns-wt-pol, which bears the virus-
derived pol.
Ultrasensitive RT assay of cells transfected with Pol constructs - Table 4
summarizes the levels of polymerase activity from mock (vector only) control, IApol
(codon optimized) and wt-pol (codon optimized) plasmids. Results indicate that the
cells transfected with wild-type pol contained RT activity approximately 4-5 logs higher
than the 293 cell only baseline values. Mock transfected cells contained activity no
higher than baseline values. The RT activity from opt-IApol-transfected cells was
also found to be no different from baseline values; no individual reaction tube resulted
in RT activity higher than the established cut-off value of 56.

Table 4

Sample                      Avg. RT/tube   Standard deviation   Minimum    Maximum
Vector only                        16.25                18.52       0.0      42.99
IApol (codon optimized)             2.99                 8.01       0.0      35.20
Wt-pol (codon optimized)          126147                21338     68973     152007
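
As a worked check of Table 4, the sketch below compares each sample's maximum reading with the stated cut-off of 56 and expresses the wt-pol average as a log10 ratio over the vector-only average. The 293 cell only baseline cited in the text is not listed in the table, so the vector-only value is used here as a stand-in; that substitution is an assumption of this sketch.

```python
import math

CUTOFF = 56  # established cut-off value stated in the text

# Values transcribed from Table 4.
avg_rt = {
    "Vector only": 16.25,
    "IApol (codon optimized)": 2.99,
    "Wt-pol (codon optimized)": 126147,
}
max_rt = {
    "Vector only": 42.99,
    "IApol (codon optimized)": 35.20,
    "Wt-pol (codon optimized)": 152007,
}

# True when at least one reaction tube exceeded the cut-off.
above_cutoff = {sample: value > CUTOFF for sample, value in max_rt.items()}

# Log10 ratio of wt-pol activity over the vector-only control.
log_over_vector = math.log10(
    avg_rt["Wt-pol (codon optimized)"] / avg_rt["Vector only"]
)

print(above_cutoff)
print(round(log_over_vector, 2))
```

Only the wt-pol sample exceeds the cut-off, and its average is close to 4 logs above the vector-only control.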











Comparative immunogenicity of optimized and virus-derived pol plasmids - To
compare the in vivo potencies of both constructs, BALB/c mice (N=10 per group)
were vaccinated with escalating doses (1, 10, 100 µg) of either V1Jns-wt-pol (virus
derived) or V1R-wt-pol (codon optimized). At 5 wks post dose 1, 5 of the 10 animals
in each group were randomly selected and boosted with the same vaccine and dose they
received initially. Figure 9 shows the geometric mean titers of the BALB/c cohorts
determined at 2 wks post boost. No significant anti-RT titers were observed in animals
immunized with one or two doses of the wt-pol plasmid (virus derived). In contrast, animals
vaccinated with the humanized gene construct gave cohort anti-RT titers (>1000)
significantly above background levels at doses above 10 µg. The responses seen at the 10
and 100 µg doses of V1R-wt-pol (codon optimized) were boosted approximately
10-fold with a second immunization, reaching titers as high as 10^6.
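
The cohort titers in Figure 9 are reported as geometric means. A minimal sketch of that calculation follows; the individual titers are hypothetical and the function name is illustrative.

```python
import math


def geometric_mean_titer(titers):
    """Geometric mean of endpoint titers, computed as the antilog of the mean log10 titer."""
    if not titers:
        raise ValueError("at least one titer is required")
    logs = [math.log10(t) for t in titers]
    return 10 ** (sum(logs) / len(logs))


# Hypothetical endpoint titers for a five-animal cohort.
cohort = [1000, 10000, 1000, 100000, 10000]
gmt = geometric_mean_titer(cohort)
print(round(gmt))
```

Averaging on the log scale keeps a single high responder from dominating the cohort value, which is why serial-dilution titers are usually summarized this way.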

Spleens from all mice in each of the cohorts were collected to be analyzed for IFN-γ
secretion following stimulation with mixtures of either CD4+ peptide epitopes or
CD8+ peptide epitopes. The results are shown in Figure 10. None of the wt-pol
(virus derived) vaccinees showed any significant cellular response above the
background controls. In contrast, strong antigen-stimulated IFN-γ secretion was
observed in a dose-responsive manner from animals vaccinated with one or two doses
of 10 µg or more of the wt-pol (codon optimized) construct.

The present invention is not to be limited in scope by the specific 
embodiments described herein. Indeed, various modifications of the invention in 
addition to those described herein will become apparent to those skilled in the art 
from the foregoing description. Such modifications are intended to fall within the 
scope of the appended claims. 



WHAT IS CLAIMED IS:

1. A pharmaceutically acceptable DNA vaccine composition, which
comprises:

(a) a DNA expression vector; and,
(b) a DNA molecule containing a codon optimized open reading frame
encoding a Pol protein or inactivated Pol derivative thereof,
wherein upon administration of the DNA vaccine to a host the Pol protein or
inactivated Pol derivative is expressed and generates a cellular immune response
against HIV-1 infection.

2. The DNA vaccine of claim 1 wherein the DNA molecule encodes wild
type Pol.

3. The DNA vaccine of claim 2 wherein the DNA molecule comprises
the nucleotide sequence as set forth in SEQ ID NO:1.

4. The DNA vaccine of claim 3 which is V1Jns-wt-pol.

5. The DNA vaccine of claim 1 wherein the DNA molecule encodes an
inactivated Pol derivative which contains a nucleotide sequence encoding a human
tissue plasminogen activator leader peptide.

6. The DNA vaccine of claim 5 wherein the DNA molecule comprises
the nucleotide sequence as set forth in SEQ ID NO:5.

7. The DNA vaccine of claim 6 which is V1Jns-tPA-wt-pol.

8. The DNA vaccine of claim 1 wherein the inactivated Pol protein
contains at least one amino acid modification within each region of the Pol protein
responsible for reverse transcriptase activity, RNase H activity and integrase activity,
such that the inactivated Pol protein shows no substantial reverse transcriptase
activity, RNase H activity and integrase activity.

9. The DNA vaccine of claim 8 wherein the DNA molecule comprises
the nucleotide sequence as set forth in SEQ ID NO:3.

10. The DNA vaccine of claim 9 which is V1Jns-IAPol.

11. The DNA vaccine of claim 8 wherein the DNA molecule encodes an
inactivated Pol derivative which contains a nucleotide sequence encoding a human
tissue plasminogen activator leader peptide.

12. The DNA vaccine of claim 11 wherein the DNA molecule comprises
the nucleotide sequence as set forth in SEQ ID NO:7.

13. The DNA vaccine of claim 7 which is V1Jns-tPA-IAPol.

14. A method for inducing an immune response against infection or
disease caused by virulent strains of HIV which comprises administering into the
tissue of a mammalian host a pharmaceutically acceptable DNA vaccine composition
which comprises a DNA expression vector and a DNA molecule containing a codon
optimized open reading frame encoding a Pol protein or inactivated Pol derivative
thereof, wherein upon administration of the DNA vaccine to the vertebrate host the
Pol protein or inactivated Pol derivative is expressed and generates the immune
response.

15. The method of claim 16 wherein the mammalian host is a human.

16. The method of claim 17 wherein the DNA vaccine is selected from the
group consisting of V1Jns-WTPol, V1Jns-tPA-WTPol, V1Jns-IAPol and V1Jns-tPA-
IAPol.

17. A substantially purified protein which comprises an amino acid
sequence selected from the group consisting of SEQ ID NO:4, SEQ ID NO:6, and
SEQ ID NO:8.
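
Claim 8 recites amino acid modifications within the enzymatic regions of Pol. As an illustration only, the sketch below lists point substitutions between two aligned segments. The one-letter segments are transcribed from residues 109-116 and 183-190 of SEQ ID NO:2 (wild type) and from the translation shown with SEQ ID NO:3 (inactivated); the function name and output notation are choices made for this sketch.

```python
def substitutions(wild_type, variant, start):
    """List point substitutions between two aligned, equal-length segments.

    start is the residue number of the first position in both segments.
    Each substitution is written as <wild-type residue><position><variant residue>.
    """
    if len(wild_type) != len(variant):
        raise ValueError("segments must be aligned and of equal length")
    return [
        f"{wt}{start + offset}{var}"
        for offset, (wt, var) in enumerate(zip(wild_type, variant))
        if wt != var
    ]


# Residues 109-116: Thr Val Leu Asp Val Gly Asp Ala (wt) vs Thr Val Leu Ala Val Gly Asp Ala (IA)
print(substitutions("TVLDVGDA", "TVLAVGDA", 109))
# Residues 183-190: Tyr Gln Tyr Met Asp Asp Leu Tyr (wt) vs Tyr Gln Tyr Met Ala Ala Leu Tyr (IA)
print(substitutions("YQYMDDLY", "YQYMAALY", 183))
```

Both segments lie in the first part of the open reading frame; the same comparison can be applied to any other aligned pair of segments from the two sequences.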
AGATCT ACCATGGCCCCCATCTCCCCCATTGAGACTGTGCCTGTGAAGCTGAAGCCTGGCATGGATGGCCCCAAGGTGAA
Bgl 1 1 Met A I aPro 1 1 eSerPr o 1 1 eG I uThrVo I ProVo I LysLeuLysProG I yMetAspG I yProLysVa I Ly 
1 10 20 

GCAGTGGCCCCTGACTGAGGAGAAGATCAAGGCCCTGGTGGAAATCTGCACTGAGATGGAGAAGGAGGGCAAAATCTCCA 
sG I nTrpProLeuThrG I uG I uLys 1 1 eLysAI aLeuVa IG I u 1 1 eCysThrG I uMetG I uLysG I uG I yLys 1 1 eSerL 

30 40 50 

AGATTGGCCCCGAGAACCCCTACAACACCCCTGTGTTTGCCATCAAGAAGAAGGACTCCACCAAGTGGAGGAAGCTGGTG 
ys 1 1 eG I yProG I uAsnProTy rAsnThrProVo I PheAIal I eLysLysLysAspSerThrLysTrpArgLysLeuVa I 

60 70 

GACTTCAGGGAGCTGAACAAGAGGACCCAGGACTTCTGGGAGGTGCAGCTGGGCATCCCCCACCCCGCTGGCCTGAAGAA 
AspPheArgG I uLeuAsnLysArgThrG I nAspPheTrpG I uVa IG I nLeuG I y 1 1 eProH i sProAl aG I yLeuLysLy 
80 90 100 

GAAGAAGTCTGTGACTGTGCTGGCTGTGGGGGATGCCTACTTCTCTGTGCCCCTGGATGAGGACTTCAGGAAGTACACTG 
sLysL y sSer Va I ThrVa I LeuAloVa IG I yAspA I aTyrPheSerVa I ProLeuAspG I uAspPheAr gLy sTyr Thr A 

110 120 130 

CCTTCACCATCCCCTCCATCAACAATGAGACCCCTGGCATCAGGrACCAGTACAATGTGCTGCCCCAGGGCTGGAAGGGC 
I oPheThr 1 1 eProSer 1 1 eAsnAsnG I uThrProG I y 1 1 eAr gTyrG I nTyrAsnVa 1 LeuProG I nG I yTrpLysG I y 

140 150 

TCCCCTGCCATCTTCCAGTCCTCCATGACCAAGATCCTGGAGCCCTTCAGGAAGCAGAACCCTGACATTGTGATCTACCA 
SerPr oA I o 1 1 ePheG I nSer SerMe tThr Ly s 1 1 eLeuG I uProPheAr gLysG I nAsnProAsp 1 1 eVa 1 1 1 eTyrG I 
160 170 180 

GTACATGGCTGCCCTGTATGTGGGCTCTGACCTGGAGATTGGGCAGCACAGGACCAAGATTGAGGAGCTGAGGCAGCACC 
nTyrMe tAloAloLeuTyrVa IG I ySerAspLeuG I ul I eG I yG I nH i sAr gThrLys I leGluGI uLeuAr gG I nHisL 

190 200 210 

TGCTGAGGTGGGGCCTGACCACCCCTGACAAGAAGCACCAGAAGGAGCCCCCCTTCCTGTGGATGGGCTATGAGCTGCAC 
euLeuArgTrpG I yLeuThrThrProAspLysLysHisGI nLysG I uProProPheLeuTrpMetG I yTyrG I uLeuHis 

220 230 

CCCGACAAGTGGACTGTGCAGCCCATTGTGCTGCCTGAGAAGGACTCCTGGACTGTGAATGACATCCAGAAGCTGGTGGG 
ProAspLysTrpThrVa IG I nPro 1 1 eVa I LeuProG I uLysAspSerTrpThrVa I AsnAsp 1 1 eG I nLysLeuVa IG I 
240 250 260 

CAAGCTGAACTGGGCCTCCCAAATCTACCCTGGCATCAAGGTGAGGCAGCTGTGCAAGCTGCTGAGGGGCACCAAGGCCC 
yLysLeuAsnTr pA I aSerG I n 1 1 eTyrPr oG I y 1 1 eLysVo I ArgG I nLeuCysLysLeuLeuArgG I yThrLysAI oL 

270 280 290 
TGACTGAGG^^

1 uVo 1 1 1 eProLeuThrG I uG I uA I aG I uLeuG I uLeuAl aG I uAsnArgG lull eLeuLysG I uProVa IHI s 

300 310 

G I yVa I Tyr TyrAspProSerLysAspLeu 1 1 eA I oG I u 1 1 eG I nLysG I nG I yG I nG I yG I nTrpThrTyrG I nlleTy 

330 340 

rGlnGluProPheLysAsnLeuLysThrGlyLysTyrAloArgMetArgGlyAloHisThrAsnAspVolLysGlnLe^ 

350 ^60 ■ 370 

CTGAGGCTGTGCAGMGATCACCACTGAG^ 

hrGluAlaVolGlnLysI leThrThrGluSerl leVol I leTrpGlyLysThrProLysPheLysLeuProI leGlnLys 

380 390 

g!S 

Im uThrTr P Tr P Thr G I uTyrTrpG InAI aThrTrpI I eProG I uTrpGI uPheVa I AsnThrProProLe 
UU 410 420 

GGTGAAGCTGTGGTACCAGCTGGAGAAGGAGCCCA^ 

uVa I LysLeuTr pTyrG InLeuG I uLysG I uPro 1 1 eVo IG I yA I aG I uThrPheTyrVa I A I aG^SS 

430 440 — 45 5 

AGACCAAGCTGGGCMGGCT(£CTATGTGACCM(^^ 

'"ThrLysLeuG.yLysAlcfilyTyrVolThrAsnArgGlyArgG.nLysVo 

460 470 

LyslhrAlaLeuGlnAlalleTyrLeuAloLeuGlnAspSerGlyLeuGluVolAsnlleValThrAJoSerGlnTyrAI 

4Q 0 500 



P ^ r^l ^1^^^^^^^^^^^ TGAG TCTG AGCTGGTGAACCAG ATCA T TGAGCAGCTGATCAAGAAGG 
1 V f e 1 1 eG I nA I oG I nProAspG InSerG I uSerG I uLeuVa I AsnG ( n I lei I eG I uGlnLeuIle^LysLysG 
5,0 520 y 5 5o 

I uL ysVo I Ty r L euA I oTrpVo I Pr oA I oH i sLysG I y 1 1 eG I yG I yAsnG I uG I nVo I AspLysLel^a I Ser A t aG I v 

5 *0 550 

> 

'^T"^^^*^^^5^"'"^ 1 p "'^^ TCCTGGAXGGCAT TGAOAAOGCCCAGGATGACX^ATGAGAAGTACOACTCOAACTOGMOGGCTAT 
g 560 ' LeuPheLeuAs P G 1 v. 1 ' eAspLysAI aG I nAspG I uH i sG I uLysTyrH i sSerAsnTrpAr gA I aMe 

570 580 
GGCCTCTGACTTCAACCTGCCCCCTGTGGTGGCTAAGGAGATTGTGGCCTCCTGTGACAAGTGCCAGCTGAAGGGGGAGG
tAI aSer AspPheAsnLeuProProVo I Va I Al aLysG lull eVa I A I oSerCysAspLysCysG I nLeuLysG I yG I uA 

590 600 610 

CCATGCATGGGCAGGTGGACTGCTCCCCTGGCATCTGGCAGGTGGCCTGCACCCACCTGGAGGGCAAGGTGATCCTGGTG 
I oMetHi sG I yG I nVa I AspCysSerProG I y 1 1 eTrpG I nLeuAloCysThrH i sLeuG I uG I yLysVa 1 1 1 eLeuVo I 

620 630 

GCTGTGCATGTGGCCTCCGGCTACATTGAGGCTGAGGTGATCCCTGCTGAGACAGGCCAGGAGACTGCCTACTTCCTGCT 
A I aVa I H i sVo I A I oSerG I y Ty r 1 1 eG I uA I oG I uVo 1 1 1 eProA I aG I uThrG I yG I nG I uTh r A I aTyr PheL euLe 
640 650 660 

GAAGCTGGCTGGCAGGTGGCCTGTGAAGACCATCCACACTGCCAATGGCTCCAACTTCACTGGGGCCACAGTGAGGGCTG 
uLysLeuA I aG I yArgTrpProVa I LysThr 1 1 eHi sThr AjaAsnG I ySerAsnPheThrG I yAl aThrVa I ArgAI aA 

670 680 690 

CCTGCTGGTGGGCTGGCATCAAGCAGGAGTTTGGCATCCCCTACAACCCCCAGTCCCAGGGGGTGGTGGCCTCCATGAAC 
I oCy sTr pTr pA I oG I y 1 1 eLysG I nG I uPheG I y 1 1 ePr oTyr AsnProG I nSerG I nG I yVa I Va I AlaSerMe t Asn 

700 710 

AAGGAGCTGAAGAAGATCATTGGGCAGGTGAGGGACCAGGCTGAGCACCTGAAGACAGCTGTGCAGATGGCTGTGTTCAT 
LysG I uLeuLyslys 1 1 e 1 1 eG I yG I nVo I Ar gAspG I nA I aG I uH i sLeuLysThr A I aVa IG I nMe tA I aVa I Phe 1 1 
720 730 740 

CCACMCTTCAAGAGGMGGGGGGCATCG(X5GGCTACTCCGCTGGGGAGAGGATTGTGGACATCATTGCCACAGACATCC 
eH i sAsnPheLysAr gLysG I yG I y 1 1 eG I yG I yTyrSer Al aG I yG I uArg 1 1 eVa I Asp 1 1 el I eA I aThrAsp 1 1 eG 

750 760 770 

AGACCAAGGAGCTCCAGAAGCAGATCACCAAGATCCAGAACTTCAGGGTGTACTACAGGGACTCCAGGAACCCCCTGTGG 
I nThf LysG I uLeuG I nlysG I n 1 1 eThrLys 1 1 eG I nAsnPheArgVo I Ty rTyr ArgAspSer ArgAsnPr oLeuTrp 

780 790 

AAGGGCCCTGCCAAGCTGCTGTGGAAGGGGGAGGGGGCTGTGGTGATCCAGGACAACTCTGACATCAAGGTGGTGCCCAG 
LysG I yProAl aLysLeuLeuTr pLysG I yG I uG I yAl oVa I Va 1 1 1 eG I nAspAsnSerAspI I eLysVa I Va I ProAr 
800 810 820 

GAGGAAGGCCAAGATCATCAGGGACTATGGCAAGCAGATGGCTGGGGATGACTGTGTGGCCTCCAGGCAGGATGAGGACT 
gAr gLysAI aLys 1 1 e 1 1 eArgAspTy rG I yLysG I nMetA I aG I yAspAspCysVo I A I aSer ArgG I nAspG I uAspx 

830 840 850 

AAAGCCCGGGC AGATCT (SEQ ID NO: 3) 
Xx BglU 
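
The reading frame of the sequence above can be checked with a short sketch. It builds the standard genetic code, translates the first 48 nucleotides of the open reading frame (transcribed from the first line of the sequence, starting at the ATG), and confirms that a coding region spanning positions 10 to 2562 holds 850 codons plus a stop codon. The helper names are illustrative and not part of any library.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 16 codons of the open reading frame, starting at the ATG.
orf_start = "ATGGCCCCCATCTCCCCCATTGAGACTGTGCCTGTGAAGCTGAAGCCT"
peptide = translate(orf_start)

# Coding region coordinates as given in the sequence listing: (10)...(2562).
cds_start, cds_end = 10, 2562
codon_count = (cds_end - cds_start + 1) // 3   # includes the stop codon
residue_count = codon_count - 1

print(peptide, codon_count, residue_count)
```

The translated peptide corresponds to Met Ala Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro, and the residue count agrees with the 850-residue length given for SEQ ID NO:2.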
SEQUENCE LISTING

<110> Merck & Co., Inc.

<120> POLYNUCLEOTIDE VACCINES EXPRESSING CODON
OPTIMIZED HIV-1 POL AND MODIFIED HIV-1 POL

<130> 20608Y PCT
<160> 30

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 2577
<212> DNA
<213> Human Immunodeficiency Virus-1

<220>
<221> CDS
<222> (10)...(2562)
<400> 1

agatctacc atg gcc ccc ate tec ccc att gag act gtg cct gtg aag ctg 51 
Met Ala Pro lie Ser Pro lie Glu Thr Val Pro Val Lys Leu 
15 10 

aag cct ggc atg gat ggc ccc aag gtg aag cag tgg ccc ctg act gag 99 
Lys Pro Gly Met Asp Gly Pro Lys Val Lys Gin Trp Pro Leu Thr Glu 
15 20 25 30 

gag aag ate aag gcc ctg gtg gaa ate tgc act gag atg gag aag gag 147 
Glu Lys lie Lys Ala Leu Val Glu lie Cys Thr Glu Met Glu Lys Glu 

35 40 45 

ggc aaa ate tec aag att ggc ccc gag aac ccc tac aac acc cct gtg 195 
Gly Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val 

50 55 60 

ttt gcc ate aag aag aag gac tec acc aag tgg agg aag ctg gtg gac 243 
Phe Ala He Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp 
65 70 75 

ttc agg gag ctg aac aag agg acc cag gac ttc tgg gag gtg cag ctg 291 
Phe Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu 
80 85 _ 90 

ggc ate ccc cac ccc get ggc ctg aag aag aag aag tct gtg act gtg 339 
Gly He Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val 
95 100 105 110 

ctg gat gtg ggg gat gcc tac ttc tct gtg ccc ctg gat gag gac ttc 387 
Leu Asp Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe 

115 120 125 

agg aag tac act gcc ttc acc ate ccc tec ate aac aat gag acc cct 435 
Arg Lys Tyr Thr Ala Phe Thr He Pro Ser He Asn Asn Glu Thr Pro 

130 135 140 
ggc atc agg tac cag tac aat gtg ctg ccc cag ggc tgg aag ggc tcc 483
Gly Ile Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser
        145                 150                 155

cct gcc atc ttc cag tcc tcc atg acc aag atc ctg gag ccc ttc agg 531
Pro Ala Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg
    160                 165                 170

aag cag aac cct gac att gtg atc tac cag tac atg gat gac ctg tat 579
Lys Gln Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr
175                 180                 185                 190

gtg ggc tct gac ctg gag att ggg cag cac agg acc aag att gag gag 627
Val Gly Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu
                195                 200                 205

ctg agg cag cac ctg ctg agg tgg ggc ctg acc acc cct gac aag aag 675
Leu Arg Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys
            210                 215                 220

cac cag aag gag ccc ccc ttc ctg tgg atg ggc tat gag ctg cac ccc 723
His Gln Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro
        225                 230                 235

gac aag tgg act gtg cag ccc att gtg ctg cct gag aag gac tcc tgg 771
Asp Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp
    240                 245                 250

act gtg aat gac atc cag aag ctg gtg ggc aag ctg aac tgg gcc tcc 819
Thr Val Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser
255                 260                 265                 270

caa atc tac cct ggc atc aag gtg agg cag ctg tgc aag ctg ctg agg 867
Gln Ile Tyr Pro Gly Ile Lys Val Arg Gln Leu Cys Lys Leu Leu Arg
                275                 280                 285

ggc acc aag gcc ctg act gag gtg atc ccc ctg act gag gag gct gag 915
Gly Thr Lys Ala Leu Thr Glu Val Ile Pro Leu Thr Glu Glu Ala Glu
            290                 295                 300

ctg gag ctg gct gag aac agg gag atc ctg aag gag cct gtg cat ggg 963
Leu Glu Leu Ala Glu Asn Arg Glu Ile Leu Lys Glu Pro Val His Gly
        305                 310                 315

gtg tac tat gac ccc tcc aag gac ctg att gct gag atc cag aag cag 1011
Val Tyr Tyr Asp Pro Ser Lys Asp Leu Ile Ala Glu Ile Gln Lys Gln
    320                 325                 330

ggc cag ggc cag tgg acc tac caa atc tac cag gag ccc ttc aag aac 1059
Gly Gln Gly Gln Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn
335                 340                 345                 350

ctg aag act ggc aag tat gcc agg atg agg ggg gcc cac acc aat gat 1107
Leu Lys Thr Gly Lys Tyr Ala Arg Met Arg Gly Ala His Thr Asn Asp
                355                 360                 365

gtg aag cag ctg act gag gct gtg cag aag atc acc act gag tcc att 1155
Val Lys Gln Leu Thr Glu Ala Val Gln Lys Ile Thr Thr Glu Ser Ile
            370                 375                 380

gtg atc tgg ggc aag acc ccc aag ttc aag ctg ccc atc cag aag gag 1203
Val Ile Trp Gly Lys Thr Pro Lys Phe Lys Leu Pro Ile Gln Lys Glu
        385                 390                 395

acc tgg gag acc tgg tgg act gag tac tgg cag gec acc tgg ate cct 1251 
Thr Trp Glu Thr Trp Trp Thr Glu Tyr Trp Gin Ala Thr Trp lie Pro 
400 405 410 

gag tgg gag ttt gtg aac acc ccc ccc ctg gtg aag ctg tgg tac cag 1299 
Glu Trp Glu Phe Val Asn Thr Pro Pro Leu Val Lys Leu Trp Tyr Gin 
415 420 425 430 

ctg gag aag gag ccc att gtg ggg get gag acc ttc tat gtg gat ggg 1347 
Leu Glu Lys Glu Pro lie Val Gly Ala Glu Thr Phe Tyr Val Asp Gly 

435 440 445 

get gee aac agg gag acc aag ctg ggc aag get ggc tat gtg acc aac 1395 
Ala Ala Asn Arg Glu Thr Lys Leu Gly Lys Ala Gly Tyr Val Thr Asn 

450 455 460 

agg ggc agg cag aag gtg gtg acc ctg act gac acc acc aac cag aag 1443 
Arg Gly Arg Gin Lys Val Val. Thr Leu Thr Asp Thr Thr Asn Gin Lys 
465 470 475 

act gag etc cag gec ate tac ctg gec etc cag gac tct ggc ctg gag 1491 
Thr Glu Leu Gin Ala lie Tyr Leu Ala Leu Gin Asp Ser Gly Leu Glu ' 
480 485 490 

gtg aac att gtg act gac tec cag tat gee ctg ggc ate ate cag gee 1539 
Val Asn lie Val Thr Asp Ser Gin Tyr Ala Leu Gly lie lie Gin Ala 
495 500 505 510 

cag cct gat cag tct gag tct gag ctg gtg aac cag ate att gag cag 1587 
Gin Pro Asp Gin Ser Glu Ser Glu Leu Val Asn Gin lie lie Glu Gin 

515 520 525 

ctg ate aag aag gag aag gtg tac ctg gee tgg gtg cct gec cac aag 1635 
Leu lie Lys Lys Glu Lys Val Tyr Leu Ala Trp Val Pro Ala His Lys 

530 535 540 

ggc att ggg ggc aat gag cag gtg gac aag ctg gtg tct get ggc ate 1683 
Gly He Gly Gly Asn Glu Gin Val Asp Lys Leu Val Ser Ala Gly He 
545 550 555 

agg aag gtg ctg ttc ctg gat ggc att gac aag gee cag gat gag cat 1731 
Arg Lys Val Leu Phe Leu Asp Gly He Asp Lys Ala Gin Asp Glu His 
560 565 570 

gag aag tac cac tec aac tgg agg get atg gec tct gac ttc aac ctg 1779 
Glu Lys Tyr His Ser Asn Trp Arg Ala Met Ala Ser Asp Phe Asn Leu 
575 580 585 590 

ccc cct gtg gtg get aag gag att gtg gec tec tgt gac aag tgc cag 1827 
Pro Pro Val Val Ala Lys Glu He Val Ala Ser Cys Asp Lys Cys Gin 

595 600 605 

ctg aag ggg gag gee atg cat ggg cag gtg gac tgc tec cct ggc ate 1875 
Leu Lys Gly Glu Ala Met His Gly Gin Val Asp Cys Ser Pro Gly He 

610 615 620 
tgg cag ctg gac tgc acc cac ctg gag ggc aag gtg atc ctg gtg gct 1923
Trp Gln Leu Asp Cys Thr His Leu Glu Gly Lys Val Ile Leu Val Ala
625                 630                 635

gtg cat gtg gec tec ggc tac att gag get gag gtg ate cct get gag 1971 
Val His Val Ala Ser Gly Tyr He Glu Ala Glu Val He Pro Ala Glu 
640 645 650 

aca ggc cag gag act gec tac ttc ctg ctg aag ctg get ggc agg tgg 2019 
Thr Gly Gin Glu Thr Ala Tyr Phe Leu Leu Lys Leu Ala Gly Arg Trp 
655 660 665 670 

cct gtg aag acc ate cac act gac aat ggc tec aac ttc act ggg gee 2067 
Pro Val Lys Thr He His Thr Asp Asn Gly Ser Asn Phe Thr Gly Ala 

675 680 685 

aca gtg agg get gec tgc tgg tgg get ggc ate aag cag gag ttt ggc 2115 
Thr Val Arg Ala Ala Cys Trp Trp Ala Gly He Lys Gin Glu Phe Gly 

690 695 700 

ate ccc tac aac ccc cag tec cag ggg gtg gtg gag tec atg aac aag 2163 
lie Pro Tyr Asn Pro Gin Ser Gin Gly Val Val Glu Ser Met Asn Lys 
705 710 715 

gag ctg aag aag ate att ggg cag gtg agg gac cag get gag cac ctg 2211 
Glu Leu Lys Lys He He Gly Gin Val Arg Asp Gin Ala Glu His Leu 
720 725 730 

aag aca get gtg cag atg get gtg ttc ate cac aac ttc aag agg aag 2259 
Lys Thr Ala Val Gin Met Ala Val Phe He His Asn Phe Lys Arg Lys 
735 740 745 750 

ggg ggc ate ggg ggc tac tec get ggg gag agg att gtg gac ate att 2307 
Gly Gly He Gly Gly Tyr Ser Ala Gly Glu Arg He Val Asp He He 

755 760 765 

gec aca gac ate cag acc aag gag etc cag aag cag ate acc aag ate 2355- 
Ala Thr Asp lie Gin Thr Lys Glu Leu Gin Lys Gin lie Thr Lys He 

770 775 780 

cag aac ttc agg gtg tac tac agg gac tec agg aac ccc ctg tgg aag 2403 
Gin Asn Phe Arg Val Tyr Tyr Arg Asp Ser Arg Asn Pro Leu Trp Lys 
785 790 795 

ggc cct gee aag ctg ctg tgg aag ggg gag ggg get gtg gtg ate cag 2451 
Gly Pro Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val He Gin 
800 805 810 

gac aac tct gac ate aag gtg gtg ccc agg agg aag gee aag ate ate 2499 
Asp Asn Ser Asp He Lys . Val Val Pro Arg Arg Lys Ala Lys He lie 
815 820 825 830 

agg gac tat ggc aag cag atg get ggg gat gac tgt gtg gee tec agg 2547 
Arg Asp Tyr Gly Lys Gin Met Ala Gly Asp Asp Cys Val Ala Ser Arg 

835 840 845 

cag gat gag gac taa agcccgggca gatct 2577 
Gln Asp Glu Asp  *
            850

<210> 2
<211> 850
<212> PRT
<213> Human Immunodeficiency Virus-1

<400> 2



Met Ala Pro Ile Ser Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro
  1               5                  10                  15

Gly Met Asp Gly Pro Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys
             20                  25                  30

Ile Lys Ala Leu Val Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys
         35                  40                  45

Ile Ser Lys Ile Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala
     50                  55                  60

Ile Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg
 65                  70                  75                  80

Glu Leu Asn Lys Arg Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile
                 85                  90                  95

Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val Leu Asp
            100                 105                 110

Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe Arg Lys
        115                 120                 125

Tyr Thr Ala Phe Thr Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile
    130                 135                 140

Arg Tyr Gln Tyr Asn Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala
145                 150                 155                 160

Ile Phe Gln Ser Ser Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln
                165                 170                 175

Asn Pro Asp Ile Val Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val Gly
            180                 185                 190

Ser Asp Leu Glu Ile Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg
        195                 200                 205

Gln His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys His Gln
    210                 215                 220

Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys
225                 230                 235                 240

Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val
                245                 250                 255

Asn Asp Ile Gln Lys Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile
            260                 265                 270

Tyr Pro Gly Ile Lys Val Arg Gln Leu Cys Lys Leu Leu Arg Gly Thr
        275                 280                 285

Lys Ala Leu Thr Glu Val Ile Pro Leu Thr Glu Glu Ala Glu Leu Glu
    290                 295                 300

Leu Ala Glu Asn Arg Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr
305                 310                 315                 320

Tyr Asp Pro Ser Lys Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Gln
                325                 330                 335

Gly Gln Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys
            340                 345                 350

Thr Gly Lys Tyr Ala Arg Met Arg Gly Ala His Thr Asn Asp Val Lys
        355                 360                 365

Gln Leu Thr Glu Ala Val Gln Lys Ile Thr Thr Glu Ser Ile Val Ile
    370                 375                 380

Trp Gly Lys Thr Pro Lys Phe Lys Leu Pro Ile Gln Lys Glu Thr Trp
385                 390                 395                 400

Glu Thr Trp Trp Thr Glu Tyr Trp Gln Ala Thr Trp Ile Pro Glu Trp
                405                 410                 415

Glu Phe Val Asn Thr Pro Pro Leu Val Lys Leu Trp Tyr Gln Leu Glu
            420                 425                 430

Lys Glu Pro Ile Val Gly Ala Glu Thr Phe Tyr Val Asp Gly Ala Ala
        435                 440                 445

Asn Arg Glu Thr Lys Leu Gly Lys Ala Gly Tyr Val Thr Asn Arg Gly
    450                 455                 460

Arg Gln Lys Val Val Thr Leu Thr Asp Thr Thr Asn Gln Lys Thr Glu
465                 470                 475                 480

Leu Gln Ala Ile Tyr Leu Ala Leu Gln Asp Ser Gly Leu Glu Val Asn
                485                 490                 495

Ile Val Thr Asp Ser Gln Tyr Ala Leu Gly Ile Ile Gln Ala Gln Pro
            500                 505                 510

Asp Gln Ser Glu Ser Glu Leu Val Asn Gln Ile Ile Glu Gln Leu Ile
        515                 520                 525

Lys Lys Glu Lys Val Tyr Leu Ala Trp Val Pro Ala His Lys Gly Ile
    530                 535                 540

Gly Gly Asn Glu Gln Val Asp Lys Leu Val Ser Ala Gly Ile Arg Lys
545                 550                 555                 560

Val Leu Phe Leu Asp Gly Ile Asp Lys Ala Gln Asp Glu His Glu Lys
                565                 570                 575

Tyr His Ser Asn Trp Arg Ala Met Ala Ser Asp Phe Asn Leu Pro Pro
            580                 585                 590

Val Val Ala Lys Glu Ile Val Ala Ser Cys Asp Lys Cys Gln Leu Lys
        595                 600                 605

Gly Glu Ala Met His Gly Gln Val Asp Cys Ser Pro Gly Ile Trp Gln
    610                 615                 620

Leu Asp Cys Thr His Leu Glu Gly Lys Val Ile Leu Val Ala Val His
625                 630                 635                 640

Val Ala Ser Gly Tyr Ile Glu Ala Glu Val Ile Pro Ala Glu Thr Gly
                645                 650                 655

Gln Glu Thr Ala Tyr Phe Leu Leu Lys Leu Ala Gly Arg Trp Pro Val
            660                 665                 670

Lys Thr Ile His Thr Asp Asn Gly Ser Asn Phe Thr Gly Ala Thr Val
        675                 680                 685

Arg Ala Ala Cys Trp Trp Ala Gly Ile Lys Gln Glu Phe Gly Ile Pro
    690                 695                 700

Tyr Asn Pro Gln Ser Gln Gly Val Val Glu Ser Met Asn Lys Glu Leu
705                 710                 715                 720

Lys Lys Ile Ile Gly Gln Val Arg Asp Gln Ala Glu His Leu Lys Thr
                725                 730                 735

Ala Val Gln Met Ala Val Phe Ile His Asn Phe Lys Arg Lys Gly Gly
            740                 745                 750

Ile Gly Gly Tyr Ser Ala Gly Glu Arg Ile Val Asp Ile Ile Ala Thr
        755                 760                 765

Asp Ile Gln Thr Lys Glu Leu Gln Lys Gln Ile Thr Lys Ile Gln Asn
    770                 775                 780

Phe Arg Val Tyr Tyr Arg Asp Ser Arg Asn Pro Leu Trp Lys Gly Pro
785                 790                 795                 800

Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val Ile Gln Asp Asn
                805                 810                 815

Ser Asp Ile Lys Val Val Pro Arg Arg Lys Ala Lys Ile Ile Arg Asp
            820                 825                 830

Tyr Gly Lys Gln Met Ala Gly Asp Asp Cys Val Ala Ser Arg Gln Asp
        835                 840                 845

Glu Asp
    850

<210> 3
<211> 2577
<212> DNA
<213> Human Immunodeficiency Virus-1

<220>
<221> CDS
<222> (10)...(2562)
<400> 3

agatctacc atg gcc ccc ate tec ccc att gag act gtg cct gtg aag ctg 51 

Met Ala Pro lie Ser Pro lie Glu Thr" Val Pro Val Lys Leu 
15 10 

aag cct ggc atg gat ggc ccc aag gtg aag cag tgg ccc ctg act gag 99 
Lys Pro Gly Met Asp Gly Pro Lys Val Lys Gin Trp Pro Leu Thr Glu 
15 20 25 30 

gag aag ate aag gcc ctg gtg gaa ate tgc act gag atg gag aag gag 147 
Glu Lys He Lys Ala Leu Val Glu He Cys Thr Glu Met Glu Lys Glu 

35 40 45 

ggc aaa ate tec aag att ggc ccc gag aac ccc tac aac ace cct gtg 195 
Gly Lys He Ser Lys He Gly Pro Glu Asn Pro Tyr Asn Thr Pro Val 

. 50 55 " 60 

ttt gcc ate aag aag aag gac tec ace aag tgg agg aag ctg gtg gac 243 
Phe Ala He Lys Lys Lys Asp Ser Thr Lys Trp Arg Lys Leu Val Asp 
65 ' 70 75 

ttc agg gag ctg aac aag agg ace cag gac ttc tgg gag gtg cag ctg 291 
Phe Arg Glu Leu Asn Lys Arg Thr Gin Asp Phe Trp Glu Val Gin Leu 
80 85 90 

ggc ate ccc cac ccc get ggc ctg aag aag aag aag tct gtg act gtg 339 
Gly He Pro His Pro Ala Gly Leu Lys Lys Lys Lys Ser Val Thr Val- 
95 100 105 110 

ctg get gtg ggg gat gcc tac ttc tct gtg ccc ctg gat gag gac ttc 387 
Leu Ala Val Gly Asp Ala Tyr Phe Ser Val Pro Leu Asp Glu Asp Phe 

115 120 125 

agg aag tac act gcc ttc ace ate ccc tec ate aac aat gag ace cct 435 
Arg Lys Tyr Thr Ala Phe Thr He Pro Ser He Asn Asn Glu Thr Pro 

130 135 140 

ggc ate agg tac cag tac aat gtg ctg ccc cag ggc tgg aag ggc tec 483 
Gly He Arg Tyr Gin Tyr Asn Val Leu Pro Gin Gly Trp Lys Gly Ser 
145 150 155 

cct gcc ate ttc cag tec tec atg ace aag ate ctg gag ccc ttc agg 531 
Pro Ala He Phe Gin Ser Ser Met Thr Lys He Leu Glu Pro Phe Arg 
160. 165 170 

aag cag aac cct gac att gtg ate tac cag tac atg get gcc ctg tat 579 
Lys Gin Asn Pro Asp He Val He Tyr Gin Tyr Met Ala Ala Leu Tyr 
175 180 185 190 

gtg ggc tct gac ctg gag att ggg cag cac agg ace aag att gag gag 627 
Val Gly Ser Asp Leu Glu He Gly Gin His Arg Thr Lys He Glu Glu 

195 200 205 

ctg agg cag cac ctg ctg agg tgg ggc ctg ace ace cct gac aag aag 675 
Leu Arg Gin His Leu Leu Arg Trp Gly Leu Thr Thr Pro Asp Lys Lys 

210 215 220 

cac cag aag gag ccc ccc ttc ctg tgg atg ggc tat gag ctg cac ccc 723 
His Gin Lys Glu Pro Pro Phe Leu Trp Met Gly Tyr Glu Leu His Pro 
225 230 235 
gac aag tgg act gtg cag ccc att gtg ctg cct gag aag gac tcc tgg 771
Asp Lys Trp Thr Val Gln Pro Ile Val Leu Pro Glu Lys Asp Ser Trp
    240                 245                 250

act gtg aat gac ate cag aag ctg gtg ggc aag ctg aac tgg gec tec 819 
Thr Val Asn Asp He Gin Lys Leu Val Gly Lys Leu Asn Trp Ala Ser 
255 260 265 270 

caa ate tac cct ggc ate aag gtg agg cag ctg tgc aag ctg ctg agg 8 67 

Gin He Tyr Pro Gly He Lys Val Arg Gin Leu Cys Lys Leu Leu Arg 

275 280 285 

ggc ace aag gee ctg act gag gtg ate ccc ctg act gag gag get gag 915 
Gly Thr Lys Ala Leu Thr Glu Val He Pro Leu Thr Glu Glu Ala Glu 

290 295 300 

ctg gag ctg get gag aac agg gag ate ctg aag gag cct gtg cat ggg 963 
Leu Glu Leu Ala Glu Asn Arg Glu He Leu Lys Glu Pro Val His Gly 
305 310 315 

gtg tac tat gac ccc tec aag gac ctg att get gag ate cag aag cag 1011 
Val Tyr Tyr Asp Pro Ser Lys . Asp Leu He Ala Glu He Gin Lys Gin 
320 "* " 325 330 



ggc cag ggc cag tgg acc tac caa atc tac cag gag ccc ttc aag aac 1059
Gly Gln Gly Gln Trp Thr Tyr Gln Ile Tyr Gln Glu Pro Phe Lys Asn
335 340 345 350

ctg aag act ggc aag tat gcc agg atg agg ggg gcc cac acc aat gat 1107
Leu Lys Thr Gly Lys Tyr Ala Arg Met Arg Gly Ala His Thr Asn Asp
355 360 365

gtg aag cag ctg act gag gct gtg cag aag atc acc act gag tcc att 1155
Val Lys Gln Leu Thr Glu Ala Val Gln Lys Ile Thr Thr Glu Ser Ile
370 375 380

gtg atc tgg ggc aag acc ccc aag ttc aag ctg ccc atc cag aag gag 1203
Val Ile Trp Gly Lys Thr Pro Lys Phe Lys Leu Pro Ile Gln Lys Glu
385 390 395

acc tgg gag acc tgg tgg act gag tac tgg cag gcc acc tgg atc cct 1251
Thr Trp Glu Thr Trp Trp Thr Glu Tyr Trp Gln Ala Thr Trp Ile Pro
400 405 410

gag tgg gag ttt gtg aac acc ccc ccc ctg gtg aag ctg tgg tac cag 1299
Glu Trp Glu Phe Val Asn Thr Pro Pro Leu Val Lys Leu Trp Tyr Gln
415 420 425 430

ctg gag aag gag ccc att gtg ggg gct gag acc ttc tat gtg gct ggg 1347
Leu Glu Lys Glu Pro Ile Val Gly Ala Glu Thr Phe Tyr Val Ala Gly
435 440 445

gct gcc aac agg gag acc aag ctg ggc aag gct ggc tat gtg acc aac 1395
Ala Ala Asn Arg Glu Thr Lys Leu Gly Lys Ala Gly Tyr Val Thr Asn
450 455 460

agg ggc agg cag aag gtg gtg acc ctg act gac acc acc aac cag aag 1443
Arg Gly Arg Gln Lys Val Val Thr Leu Thr Asp Thr Thr Asn Gln Lys
465 470 475
act gcc ctc cag gcc atc tac ctg gcc ctc cag gac tct ggc ctg gag 1491
Thr Ala Leu Gln Ala Ile Tyr Leu Ala Leu Gln Asp Ser Gly Leu Glu
480 485 490

gtg aac att gtg act gcc tcc cag tat gcc ctg ggc atc atc cag gcc 1539
Val Asn Ile Val Thr Ala Ser Gln Tyr Ala Leu Gly Ile Ile Gln Ala
495 500 505 510

cag cct gat cag tct gag tct gag ctg gtg aac cag atc att gag cag 1587
Gln Pro Asp Gln Ser Glu Ser Glu Leu Val Asn Gln Ile Ile Glu Gln
515 520 525

ctg atc aag aag gag aag gtg tac ctg gcc tgg gtg cct gcc cac aag 1635
Leu Ile Lys Lys Glu Lys Val Tyr Leu Ala Trp Val Pro Ala His Lys
530 535 540

ggc att ggg ggc aat gag cag gtg gac aag ctg gtg tct gct ggc atc 1683
Gly Ile Gly Gly Asn Glu Gln Val Asp Lys Leu Val Ser Ala Gly Ile
545 550 555

agg aag gtg ctg ttc ctg gat ggc att gac aag gcc cag gat gag cat 1731
Arg Lys Val Leu Phe Leu Asp Gly Ile Asp Lys Ala Gln Asp Glu His
560 565 570

gag aag tac cac tcc aac tgg agg gct atg gcc tct gac ttc aac ctg 1779
Glu Lys Tyr His Ser Asn Trp Arg Ala Met Ala Ser Asp Phe Asn Leu
575 580 585 590

ccc cct gtg gtg gct aag gag att gtg gcc tcc tgt gac aag tgc cag 1827
Pro Pro Val Val Ala Lys Glu Ile Val Ala Ser Cys Asp Lys Cys Gln
595 600 605

ctg aag ggg gag gcc atg cat ggg cag gtg gac tgc tcc cct ggc atc 1875
Leu Lys Gly Glu Ala Met His Gly Gln Val Asp Cys Ser Pro Gly Ile
610 615 620

tgg cag ctg gcc tgc acc cac ctg gag ggc aag gtg atc ctg gtg gct 1923
Trp Gln Leu Ala Cys Thr His Leu Glu Gly Lys Val Ile Leu Val Ala
625 630 635

gtg cat gtg gcc tcc ggc tac att gag gct gag gtg atc cct gct gag 1971
Val His Val Ala Ser Gly Tyr Ile Glu Ala Glu Val Ile Pro Ala Glu
640 645 650

aca ggc cag gag act gcc tac ttc ctg ctg aag ctg gct ggc agg tgg 2019
Thr Gly Gln Glu Thr Ala Tyr Phe Leu Leu Lys Leu Ala Gly Arg Trp
655 660 665 670

cct gtg aag acc atc cac act gcc aat ggc tcc aac ttc act ggg gcc 2067
Pro Val Lys Thr Ile His Thr Ala Asn Gly Ser Asn Phe Thr Gly Ala
675 680 685

aca gtg agg gct gcc tgc tgg tgg gct ggc atc aag cag gag ttt ggc 2115
Thr Val Arg Ala Ala Cys Trp Trp Ala Gly Ile Lys Gln Glu Phe Gly
690 695 700

atc ccc tac aac ccc cag tcc cag ggg gtg gtg gcc tcc atg aac aag 2163
Ile Pro Tyr Asn Pro Gln Ser Gln Gly Val Val Ala Ser Met Asn Lys
705 710 715
gag ctg aag aag atc att ggg cag gtg agg gac cag gct gag cac ctg 2211
Glu Leu Lys Lys Ile Ile Gly Gln Val Arg Asp Gln Ala Glu His Leu
720 725 730

aag aca gct gtg cag atg gct gtg ttc atc cac aac ttc aag agg aag 2259
Lys Thr Ala Val Gln Met Ala Val Phe Ile His Asn Phe Lys Arg Lys
735 740 745 750

ggg ggc atc ggg ggc tac tcc gct ggg gag agg att gtg gac atc att 2307
Gly Gly Ile Gly Gly Tyr Ser Ala Gly Glu Arg Ile Val Asp Ile Ile
755 760 765

gcc aca gac atc cag acc aag gag ctc cag aag cag atc acc aag atc 2355
Ala Thr Asp Ile Gln Thr Lys Glu Leu Gln Lys Gln Ile Thr Lys Ile
770 775 780

cag aac ttc agg gtg tac tac agg gac tcc agg aac ccc ctg tgg aag 2403
Gln Asn Phe Arg Val Tyr Tyr Arg Asp Ser Arg Asn Pro Leu Trp Lys
785 790 795

ggc cct gcc aag ctg ctg tgg aag ggg gag ggg gct gtg gtg atc cag 2451
Gly Pro Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val Ile Gln
800 805 810

gac aac tct gac atc aag gtg gtg ccc agg agg aag gcc aag atc atc 2499
Asp Asn Ser Asp Ile Lys Val Val Pro Arg Arg Lys Ala Lys Ile Ile
815 820 825 830

agg gac tat ggc aag cag atg gct ggg gat gac tgt gtg gcc tcc agg 2547
Arg Asp Tyr Gly Lys Gln Met Ala Gly Asp Asp Cys Val Ala Ser Arg
835 840 845

cag gat gag gac taa agcccgggca gatct 2577
Gln Asp Glu Asp *
850

<210> 4 
<211> 850 

<212> PRT

<213> Human Immunodeficiency Virus-1



<400> 4 
Val Ala Ser Gly Tyr Ile Glu Ala Glu Val Ile Pro Ala Glu Thr Gly
645 650 655

Gln Glu Thr Ala Tyr Phe Leu Leu Lys Leu Ala Gly Arg Trp Pro Val
660 665 670

Lys Thr Ile His Thr Ala Asn Gly Ser Asn Phe Thr Gly Ala Thr Val
675 680 685

Arg Ala Ala Cys Trp Trp Ala Gly Ile Lys Gln Glu Phe Gly Ile Pro
690 695 700

Tyr Asn Pro Gln Ser Gln Gly Val Val Ala Ser Met Asn Lys Glu Leu
705 710 715 720

Lys Lys Ile Ile Gly Gln Val Arg Asp Gln Ala Glu His Leu Lys Thr
725 730 735

Ala Val Gln Met Ala Val Phe Ile His Asn Phe Lys Arg Lys Gly Gly
740 745 750

Ile Gly Gly Tyr Ser Ala Gly Glu Arg Ile Val Asp Ile Ile Ala Thr
755 760 765

Asp Ile Gln Thr Lys Glu Leu Gln Lys Gln Ile Thr Lys Ile Gln Asn
770 775 780

Phe Arg Val Tyr Tyr Arg Asp Ser Arg Asn Pro Leu Trp Lys Gly Pro
785 790 795 800

Ala Lys Leu Leu Trp Lys Gly Glu Gly Ala Val Val Ile Gln Asp Asn
805 810 815

Ser Asp Ile Lys Val Val Pro Arg Arg Lys Ala Lys Ile Ile Arg Asp
820 825 830

Tyr Gly Lys Gln Met Ala Gly Asp Asp Cys Val Ala Ser Arg Gln Asp
835 840 845

Glu Asp
850

<210> 5 
<211> 2650 
<212> DNA 

<213> Human Immunodeficiency Virus-1 

<220> 
<221> CDS 

<222> (8)...(2635)
<400> 5 

gatcacc atg gat gca atg aag aga ggg ctc tgc tgt gtg ctg ctg ctg 49
Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu
1 5 10

tgt gga gca gtc ttc gtt tcg ccc agc gag atc tcc gcc ccc atc tcc 97
Cys Gly Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ala Pro Ile Ser
15 20 25 30

ccc att gag act gtg cct gtg aag ctg aag cct ggc atg gat ggc ccc 145
Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met Asp Gly Pro
35 40 45

aag gtg aag cag tgg ccc ctg act gag gag aag atc aag gcc ctg gtg 193
Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys Ala Leu Val
50 55 60

gaa atc tgc act gag atg gag aag gag ggc aaa atc tcc aag att ggc 241
Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser Lys Ile Gly
65 70 75

ccc gag aac ccc tac aac acc cct gtg ttt gcc atc aag aag aag gac 289
Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys Lys Lys Asp
80 85 90
tcc acc aag tgg agg aag ctg gtg gac ttc agg gag ctg aac aag agg 337
Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu Asn Lys Arg
95 100 105 110

acc cag gac ttc tgg gag gtg cag ctg ggc atc ccc cac ccc gct ggc 385
Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His Pro Ala Gly
115 120 125

ctg aag aag aag aag tct gtg act gtg ctg gat gtg ggg gat gcc tac 433
Leu Lys Lys Lys Lys Ser Val Thr Val Leu Asp Val Gly Asp Ala Tyr
130 135 140

ttc tct gtg ccc ctg gat gag gac ttc agg aag tac act gcc ttc acc 481
Phe Ser Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr
145 150 155

atc ccc tcc atc aac aat gag acc cct ggc atc agg tac cag tac aat 529
Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn
160 165 170

gtg ctg ccc cag ggc tgg aag ggc tcc cct gcc atc ttc cag tcc tcc 577
Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser
175 180 185 190

atg acc aag atc ctg gag ccc ttc agg aag cag aac cct gac att gtg 625
Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro Asp Ile Val
195 200 205

atc tac cag tac atg gat gac ctg tat gtg ggc tct gac ctg gag att 673
Ile Tyr Gln Tyr Met Asp Asp Leu Tyr Val Gly Ser Asp Leu Glu Ile
210 215 220

ggg cag cac agg acc aag att gag gag ctg agg cag cac ctg ctg agg 721
Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg Gln His Leu Leu Arg
225 230 235

tgg ggc ctg acc acc cct gac aag aag cac cag aag gag ccc ccc ttc 769
Trp Gly Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu Pro Pro Phe
240 245 250

ctg tgg atg ggc tat gag ctg cac ccc gac aag tgg act gtg cag ccc 817
Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro
255 260 265 270

att gtg ctg cct gag aag gac tcc tgg act gtg aat gac atc cag aag 865
Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys
275 280 285

ctg gtg ggc aag ctg aac tgg gcc tcc caa atc tac cct ggc atc aag 913
Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys
290 295 300

gtg agg cag ctg tgc aag ctg ctg agg ggc acc aag gcc ctg act gag 961
Val Arg Gln Leu Cys Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Glu
305 310 315

gtg atc ccc ctg act gag gag gct gag ctg gag ctg gct gag aac agg 1009
Val Ile Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala Glu Asn Arg
320 325 330
gag atc ctg aag gag cct gtg cat ggg gtg tac tat gac ccc tcc aag 1057
Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp Pro Ser Lys
335 340 345 350

gac ctg att gct gag atc cag aag cag ggc cag ggc cag tgg acc tac 1105
Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln Trp Thr Tyr
355 360 365

caa atc tac cag gag ccc ttc aag aac ctg aag act ggc aag tat gcc 1153
Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly Lys Tyr Ala
370 375 380

agg atg agg ggg gcc cac acc aat gat gtg aag cag ctg act gag gct 1201
Arg Met Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu Thr Glu Ala
385 390 395

gtg cag aag atc acc act gag tcc att gtg atc tgg ggc aag acc ccc 1249
Val Gln Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly Lys Thr Pro
400 405 410

aag ttc aag ctg ccc atc cag aag gag acc tgg gag acc tgg tgg act 1297
Lys Phe Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr Trp Trp Thr
415 420 425 430

gag tac tgg cag gcc acc tgg atc cct gag tgg gag ttt gtg aac acc 1345
Glu Tyr Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe Val Asn Thr
435 440 445

ccc ccc ctg gtg aag ctg tgg tac cag ctg gag aag gag ccc att gtg 1393
Pro Pro Leu Val Lys Leu Trp Tyr Gln Leu Glu Lys Glu Pro Ile Val
450 455 460

ggg gct gag acc ttc tat gtg gat ggg gct gcc aac agg gag acc aag 1441
Gly Ala Glu Thr Phe Tyr Val Asp Gly Ala Ala Asn Arg Glu Thr Lys
465 470 475

ctg ggc aag gct ggc tat gtg acc aac agg ggc agg cag aag gtg gtg 1489
Leu Gly Lys Ala Gly Tyr Val Thr Asn Arg Gly Arg Gln Lys Val Val
480 485 490

acc ctg act gac acc acc aac cag aag act gag ctc cag gcc atc tac 1537
Thr Leu Thr Asp Thr Thr Asn Gln Lys Thr Glu Leu Gln Ala Ile Tyr
495 500 505 510

ctg gcc ctc cag gac tct ggc ctg gag gtg aac att gtg act gac tcc 1585
Leu Ala Leu Gln Asp Ser Gly Leu Glu Val Asn Ile Val Thr Asp Ser
515 520 525

cag tat gcc ctg ggc atc atc cag gcc cag cct gat cag tct gag tct 1633
Gln Tyr Ala Leu Gly Ile Ile Gln Ala Gln Pro Asp Gln Ser Glu Ser
530 535 540

gag ctg gtg aac cag atc att gag cag ctg atc aag aag gag aag gtg 1681
Glu Leu Val Asn Gln Ile Ile Glu Gln Leu Ile Lys Lys Glu Lys Val
545 550 555

tac ctg gcc tgg gtg cct gcc cac aag ggc att ggg ggc aat gag cag 1729
Tyr Leu Ala Trp Val Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln
560 565 570
gtg gac aag ctg gtg tct gct ggc atc agg aag gtg ctg ttc ctg gat 1777
Val Asp Lys Leu Val Ser Ala Gly Ile Arg Lys Val Leu Phe Leu Asp
575 580 585 590

ggc att gac aag gcc cag gat gag cat gag aag tac cac tcc aac tgg 1825
Gly Ile Asp Lys Ala Gln Asp Glu His Glu Lys Tyr His Ser Asn Trp
595 600 605

agg gct atg gcc tct gac ttc aac ctg ccc cct gtg gtg gct aag gag 1873
Arg Ala Met Ala Ser Asp Phe Asn Leu Pro Pro Val Val Ala Lys Glu
610 615 620

att gtg gcc tcc tgt gac aag tgc cag ctg aag ggg gag gcc atg cat 1921
Ile Val Ala Ser Cys Asp Lys Cys Gln Leu Lys Gly Glu Ala Met His
625 630 635

ggg cag gtg gac tgc tcc cct ggc atc tgg cag ctg gac tgc acc cac 1969
Gly Gln Val Asp Cys Ser Pro Gly Ile Trp Gln Leu Asp Cys Thr His
640 645 650

ctg gag ggc aag gtg atc ctg gtg gct gtg cat gtg gcc tcc ggc tac 2017
Leu Glu Gly Lys Val Ile Leu Val Ala Val His Val Ala Ser Gly Tyr
655 660 665 670

att gag gct gag gtg atc cct gct gag aca ggc cag gag act gcc tac 2065
Ile Glu Ala Glu Val Ile Pro Ala Glu Thr Gly Gln Glu Thr Ala Tyr
675 680 685

ttc ctg ctg aag ctg gct ggc agg tgg cct gtg aag acc atc cac act 2113
Phe Leu Leu Lys Leu Ala Gly Arg Trp Pro Val Lys Thr Ile His Thr
690 695 700

gac aat ggc tcc aac ttc act ggg gcc aca gtg agg gct gcc tgc tgg 2161
Asp Asn Gly Ser Asn Phe Thr Gly Ala Thr Val Arg Ala Ala Cys Trp
705 710 715

tgg gct ggc atc aag cag gag ttt ggc atc ccc tac aac ccc cag tcc 2209
Trp Ala Gly Ile Lys Gln Glu Phe Gly Ile Pro Tyr Asn Pro Gln Ser
720 725 730

cag ggg gtg gtg gag tcc atg aac aag gag ctg aag aag atc att ggg 2257
Gln Gly Val Val Glu Ser Met Asn Lys Glu Leu Lys Lys Ile Ile Gly
735 740 745 750

cag gtg agg gac cag gct gag cac ctg aag aca gct gtg cag atg gct 2305
Gln Val Arg Asp Gln Ala Glu His Leu Lys Thr Ala Val Gln Met Ala
755 760 765

gtg ttc atc cac aac ttc aag agg aag ggg ggc atc ggg ggc tac tcc 2353
Val Phe Ile His Asn Phe Lys Arg Lys Gly Gly Ile Gly Gly Tyr Ser
770 775 780

gct ggg gag agg att gtg gac atc att gcc aca gac atc cag acc aag 2401
Ala Gly Glu Arg Ile Val Asp Ile Ile Ala Thr Asp Ile Gln Thr Lys
785 790 795



gag ctc cag aag cag atc acc aag atc cag aac ttc agg gtg tac tac 2449
Glu Leu Gln Lys Gln Ile Thr Lys Ile Gln Asn Phe Arg Val Tyr Tyr
800 805 810

agg gac tcc agg aac ccc ctg tgg aag ggc cct gcc aag ctg ctg tgg 2497
Arg Asp Ser Arg Asn Pro Leu Trp Lys Gly Pro Ala Lys Leu Leu Trp
815 820 825 830

aag ggg gag ggg gct gtg gtg atc cag gac aac tct gac atc aag gtg 2545
Lys Gly Glu Gly Ala Val Val Ile Gln Asp Asn Ser Asp Ile Lys Val
835 840 845

gtg ccc agg agg aag gcc aag atc atc agg gac tat ggc aag cag atg 2593
Val Pro Arg Arg Lys Ala Lys Ile Ile Arg Asp Tyr Gly Lys Gln Met
850 855 860

gct ggg gat gac tgt gtg gcc tcc agg cag gat gag gac taa 2635
Ala Gly Asp Asp Cys Val Ala Ser Arg Gln Asp Glu Asp *
865 870 875

agcccgggca gatct 2650

<210> 6 
<211> 875 
<212> PRT 

<213> Human Immunodeficiency Virus-1 
<400> 6 

Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
1 5 10 15

Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ala Pro Ile Ser Pro Ile
20 25 30

Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met Asp Gly Pro Lys Val
35 40 45

Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys Ala Leu Val Glu Ile
50 55 60

Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser Lys Ile Gly Pro Glu
65 70 75 80

Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys Lys Lys Asp Ser Thr
85 90 95

Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu Asn Lys Arg Thr Gln
100 105 110

Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His Pro Ala Gly Leu Lys
115 120 125

Lys Lys Lys Ser Val Thr Val Leu Asp Val Gly Asp Ala Tyr Phe Ser
130 135 140

Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro
145 150 155 160

Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu
165 170 175

Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser Met Thr
180 185 190

Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro Asp Ile Val Ile Tyr
195 200 205

Gln Tyr Met Asp Asp Leu Tyr Val Gly Ser Asp Leu Glu Ile Gly Gln
210 215 220

His Arg Thr Lys Ile Glu Glu Leu Arg Gln His Leu Leu Arg Trp Gly
225 230 235 240

Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu Pro Pro Phe Leu Trp
245 250 255

Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro Ile Val
260 265 270

Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys Leu Val
275 280 285

Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys Val Arg
290 295 300

Gln Leu Cys Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Glu Val Ile
305 310 315 320

Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala Glu Asn Arg Glu Ile
325 330 335

Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp Pro Ser Lys Asp Leu
340 345 350

Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln Trp Thr Tyr Gln Ile
355 360 365

Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly Lys Tyr Ala Arg Met
370 375 380

Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu Thr Glu Ala Val Gln
385 390 395 400

Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly Lys Thr Pro Lys Phe
405 410 415

Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr Trp Trp Thr Glu Tyr
420 425 430

Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe Val Asn Thr Pro Pro
435 440 445

Leu Val Lys Leu Trp Tyr Gln Leu Glu Lys Glu Pro Ile Val Gly Ala
450 455 460

Glu Thr Phe Tyr Val Asp Gly Ala Ala Asn Arg Glu Thr Lys Leu Gly
465 470 475 480

Lys Ala Gly Tyr Val Thr Asn Arg Gly Arg Gln Lys Val Val Thr Leu
485 490 495

Thr Asp Thr Thr Asn Gln Lys Thr Glu Leu Gln Ala Ile Tyr Leu Ala
500 505 510

Leu Gln Asp Ser Gly Leu Glu Val Asn Ile Val Thr Asp Ser Gln Tyr
515 520 525

Ala Leu Gly Ile Ile Gln Ala Gln Pro Asp Gln Ser Glu Ser Glu Leu
530 535 540

Val Asn Gln Ile Ile Glu Gln Leu Ile Lys Lys Glu Lys Val Tyr Leu
545 550 555 560

Ala Trp Val Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln Val Asp
565 570 575

Lys Leu Val Ser Ala Gly Ile Arg Lys Val Leu Phe Leu Asp Gly Ile
580 585 590

Asp Lys Ala Gln Asp Glu His Glu Lys Tyr His Ser Asn Trp Arg Ala
595 600 605

Met Ala Ser Asp Phe Asn Leu Pro Pro Val Val Ala Lys Glu Ile Val
610 615 620

Ala Ser Cys Asp Lys Cys Gln Leu Lys Gly Glu Ala Met His Gly Gln
625 630 635 640

Val Asp Cys Ser Pro Gly Ile Trp Gln Leu Asp Cys Thr His Leu Glu
645 650 655

Gly Lys Val Ile Leu Val Ala Val His Val Ala Ser Gly Tyr Ile Glu
660 665 670

Ala Glu Val Ile Pro Ala Glu Thr Gly Gln Glu Thr Ala Tyr Phe Leu
675 680 685

Leu Lys Leu Ala Gly Arg Trp Pro Val Lys Thr Ile His Thr Asp Asn
690 695 700

Gly Ser Asn Phe Thr Gly Ala Thr Val Arg Ala Ala Cys Trp Trp Ala
705 710 715 720

Gly Ile Lys Gln Glu Phe Gly Ile Pro Tyr Asn Pro Gln Ser Gln Gly
725 730 735

Val Val Glu Ser Met Asn Lys Glu Leu Lys Lys Ile Ile Gly Gln Val
740 745 750

Arg Asp Gln Ala Glu His Leu Lys Thr Ala Val Gln Met Ala Val Phe
755 760 765

Ile His Asn Phe Lys Arg Lys Gly Gly Ile Gly Gly Tyr Ser Ala Gly
770 775 780

Glu Arg Ile Val Asp Ile Ile Ala Thr Asp Ile Gln Thr Lys Glu Leu
785 790 795 800

Gln Lys Gln Ile Thr Lys Ile Gln Asn Phe Arg Val Tyr Tyr Arg Asp
805 810 815

Ser Arg Asn Pro Leu Trp Lys Gly Pro Ala Lys Leu Leu Trp Lys Gly
820 825 830

Glu Gly Ala Val Val Ile Gln Asp Asn Ser Asp Ile Lys Val Val Pro
835 840 845

Arg Arg Lys Ala Lys Ile Ile Arg Asp Tyr Gly Lys Gln Met Ala Gly
850 855 860

Asp Asp Cys Val Ala Ser Arg Gln Asp Glu Asp
865 870 875

<210> 7 
<211> 2650 
<212> DNA 

<213> Human Immunodeficiency Virus-1 

<220> 
<221> CDS 
<222> (8)...(2635)

<400> 7 

gatcacc atg gat gca atg aag aga ggg ctc tgc tgt gtg ctg ctg ctg 49
Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu
1 5 10

tgt gga gca gtc ttc gtt tcg ccc agc gag atc tcc gcc ccc atc tcc 97
Cys Gly Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ala Pro Ile Ser
15 20 25 30

ccc att gag act gtg cct gtg aag ctg aag cct ggc atg gat ggc ccc 145
Pro Ile Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met Asp Gly Pro
35 40 45

aag gtg aag cag tgg ccc ctg act gag gag aag atc aag gcc ctg gtg 193
Lys Val Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys Ala Leu Val
50 55 60

gaa atc tgc act gag atg gag aag gag ggc aaa atc tcc aag att ggc 241
Glu Ile Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser Lys Ile Gly
65 70 75

ccc gag aac ccc tac aac acc cct gtg ttt gcc atc aag aag aag gac 289
Pro Glu Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys Lys Lys Asp
80 85 90

tcc acc aag tgg agg aag ctg gtg gac ttc agg gag ctg aac aag agg 337
Ser Thr Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu Asn Lys Arg
95 100 105 110

acc cag gac ttc tgg gag gtg cag ctg ggc atc ccc cac ccc gct ggc 385
Thr Gln Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His Pro Ala Gly
115 120 125

ctg aag aag aag aag tct gtg act gtg ctg gct gtg ggg gat gcc tac 433
Leu Lys Lys Lys Lys Ser Val Thr Val Leu Ala Val Gly Asp Ala Tyr
130 135 140

ttc tct gtg ccc ctg gat gag gac ttc agg aag tac act gcc ttc acc 481
Phe Ser Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr
145 150 155
atc ccc tcc atc aac aat gag acc cct ggc atc agg tac cag tac aat 529
Ile Pro Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn
160 165 170

gtg ctg ccc cag ggc tgg aag ggc tcc cct gcc atc ttc cag tcc tcc 577
Val Leu Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser
175 180 185 190

atg acc aag atc ctg gag ccc ttc agg aag cag aac cct gac att gtg 625
Met Thr Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro Asp Ile Val
195 200 205

atc tac cag tac atg gct gcc ctg tat gtg ggc tct gac ctg gag att 673
Ile Tyr Gln Tyr Met Ala Ala Leu Tyr Val Gly Ser Asp Leu Glu Ile
210 215 220

ggg cag cac agg acc aag att gag gag ctg agg cag cac ctg ctg agg 721
Gly Gln His Arg Thr Lys Ile Glu Glu Leu Arg Gln His Leu Leu Arg
225 230 235

tgg ggc ctg acc acc cct gac aag aag cac cag aag gag ccc ccc ttc 769
Trp Gly Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu Pro Pro Phe
240 245 250

ctg tgg atg ggc tat gag ctg cac ccc gac aag tgg act gtg cag ccc 817
Leu Trp Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro
255 260 265 270

att gtg ctg cct gag aag gac tcc tgg act gtg aat gac atc cag aag 865
Ile Val Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys
275 280 285

ctg gtg ggc aag ctg aac tgg gcc tcc caa atc tac cct ggc atc aag 913
Leu Val Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys
290 295 300

gtg agg cag ctg tgc aag ctg ctg agg ggc acc aag gcc ctg act gag 961
Val Arg Gln Leu Cys Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Glu
305 310 315

gtg atc ccc ctg act gag gag gct gag ctg gag ctg gct gag aac agg 1009
Val Ile Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala Glu Asn Arg
320 325 330

gag atc ctg aag gag cct gtg cat ggg gtg tac tat gac ccc tcc aag 1057
Glu Ile Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp Pro Ser Lys
335 340 345 350

gac ctg att gct gag atc cag aag cag ggc cag ggc cag tgg acc tac 1105
Asp Leu Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln Trp Thr Tyr
355 360 365

caa atc tac cag gag ccc ttc aag aac ctg aag act ggc aag tat gcc 1153
Gln Ile Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly Lys Tyr Ala
370 375 380

agg atg agg ggg gcc cac acc aat gat gtg aag cag ctg act gag gct 1201
Arg Met Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu Thr Glu Ala
385 390 395

gtg cag aag atc acc act gag tcc att gtg atc tgg ggc aag acc ccc 1249
Val Gln Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly Lys Thr Pro
400 405 410

aag ttc aag ctg ccc atc cag aag gag acc tgg gag acc tgg tgg act 1297
Lys Phe Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr Trp Trp Thr
415 420 425 430

gag tac tgg cag gcc acc tgg atc cct gag tgg gag ttt gtg aac acc 1345
Glu Tyr Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe Val Asn Thr
435 440 445

ccc ccc ctg gtg aag ctg tgg tac cag ctg gag aag gag ccc att gtg 1393
Pro Pro Leu Val Lys Leu Trp Tyr Gln Leu Glu Lys Glu Pro Ile Val
450 455 460

ggg gct gag acc ttc tat gtg gct ggg gct gcc aac agg gag acc aag 1441
Gly Ala Glu Thr Phe Tyr Val Ala Gly Ala Ala Asn Arg Glu Thr Lys
465 470 475

ctg ggc aag gct ggc tat gtg acc aac agg ggc agg cag aag gtg gtg 1489
Leu Gly Lys Ala Gly Tyr Val Thr Asn Arg Gly Arg Gln Lys Val Val
480 485 490

acc ctg act gac acc acc aac cag aag act gcc ctc cag gcc atc tac 1537
Thr Leu Thr Asp Thr Thr Asn Gln Lys Thr Ala Leu Gln Ala Ile Tyr
495 500 505 510

ctg gcc ctc cag gac tct ggc ctg gag gtg aac att gtg act gcc tcc 1585
Leu Ala Leu Gln Asp Ser Gly Leu Glu Val Asn Ile Val Thr Ala Ser
515 520 525

cag tat gcc ctg ggc atc atc cag gcc cag cct gat cag tct gag tct 1633
Gln Tyr Ala Leu Gly Ile Ile Gln Ala Gln Pro Asp Gln Ser Glu Ser
530 535 540

gag ctg gtg aac cag atc att gag cag ctg atc aag aag gag aag gtg 1681
Glu Leu Val Asn Gln Ile Ile Glu Gln Leu Ile Lys Lys Glu Lys Val
545 550 555

tac ctg gcc tgg gtg cct gcc cac aag ggc att ggg ggc aat gag cag 1729
Tyr Leu Ala Trp Val Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln
560 565 570

gtg gac aag ctg gtg tct gct ggc atc agg aag gtg ctg ttc ctg gat 1777
Val Asp Lys Leu Val Ser Ala Gly Ile Arg Lys Val Leu Phe Leu Asp
575 580 585 590

ggc att gac aag gcc cag gat gag cat gag aag tac cac tcc aac tgg 1825
Gly Ile Asp Lys Ala Gln Asp Glu His Glu Lys Tyr His Ser Asn Trp
595 600 605

agg gct atg gcc tct gac ttc aac ctg ccc cct gtg gtg gct aag gag 1873
Arg Ala Met Ala Ser Asp Phe Asn Leu Pro Pro Val Val Ala Lys Glu
610 615 620

att gtg gcc tcc tgt gac aag tgc cag ctg aag ggg gag gcc atg cat 1921
Ile Val Ala Ser Cys Asp Lys Cys Gln Leu Lys Gly Glu Ala Met His
625 630 635
ggg cag gtg gac tgc tcc cct ggc atc tgg cag ctg gcc tgc acc cac 1969
Gly Gln Val Asp Cys Ser Pro Gly Ile Trp Gln Leu Ala Cys Thr His
640 645 650

ctg gag ggc aag gtg atc ctg gtg gct gtg cat gtg gcc tcc ggc tac 2017
Leu Glu Gly Lys Val Ile Leu Val Ala Val His Val Ala Ser Gly Tyr
655 660 665 670

att gag gct gag gtg atc cct gct gag aca ggc cag gag act gcc tac 2065
Ile Glu Ala Glu Val Ile Pro Ala Glu Thr Gly Gln Glu Thr Ala Tyr
675 680 685

ttc ctg ctg aag ctg gct ggc agg tgg cct gtg aag acc atc cac act 2113
Phe Leu Leu Lys Leu Ala Gly Arg Trp Pro Val Lys Thr Ile His Thr
690 695 700

gcc aat ggc tcc aac ttc act ggg gcc aca gtg agg gct gcc tgc tgg 2161
Ala Asn Gly Ser Asn Phe Thr Gly Ala Thr Val Arg Ala Ala Cys Trp
705 710 715

tgg gct ggc atc aag cag gag ttt ggc atc ccc tac aac ccc cag tcc 2209
Trp Ala Gly Ile Lys Gln Glu Phe Gly Ile Pro Tyr Asn Pro Gln Ser
720 725 730

cag ggg gtg gtg gcc tcc atg aac aag gag ctg aag aag atc att ggg 2257
Gln Gly Val Val Ala Ser Met Asn Lys Glu Leu Lys Lys Ile Ile Gly
735 740 745 750

cag gtg agg gac cag gct gag cac ctg aag aca gct gtg cag atg gct 2305
Gln Val Arg Asp Gln Ala Glu His Leu Lys Thr Ala Val Gln Met Ala
755 760 765

gtg ttc atc cac aac ttc aag agg aag ggg ggc atc ggg ggc tac tcc 2353
Val Phe Ile His Asn Phe Lys Arg Lys Gly Gly Ile Gly Gly Tyr Ser
770 775 780

gct ggg gag agg att gtg gac atc att gcc aca gac atc cag acc aag 2401
Ala Gly Glu Arg Ile Val Asp Ile Ile Ala Thr Asp Ile Gln Thr Lys
785 790 795

gag ctc cag aag cag atc acc aag atc cag aac ttc agg gtg tac tac 2449
Glu Leu Gln Lys Gln Ile Thr Lys Ile Gln Asn Phe Arg Val Tyr Tyr
800 805 810

agg gac tcc agg aac ccc ctg tgg aag ggc cct gcc aag ctg ctg tgg 2497
Arg Asp Ser Arg Asn Pro Leu Trp Lys Gly Pro Ala Lys Leu Leu Trp
815 820 825 830

aag ggg gag ggg gct gtg gtg atc cag gac aac tct gac atc aag gtg 2545
Lys Gly Glu Gly Ala Val Val Ile Gln Asp Asn Ser Asp Ile Lys Val
835 840 845

gtg ccc agg agg aag gcc aag atc atc agg gac tat ggc aag cag atg 2593
Val Pro Arg Arg Lys Ala Lys Ile Ile Arg Asp Tyr Gly Lys Gln Met
850 855 860

gct ggg gat gac tgt gtg gcc tcc agg cag gat gag gac taa 2635
Ala Gly Asp Asp Cys Val Ala Ser Arg Gln Asp Glu Asp *
865 870 875

agcccgggca gatct 2650
<210> 8
<211> 875
<212> PRT

<213> Human Immunodeficiency Virus-1
<400> 8

Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
1 5 10 15

Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ala Pro Ile Ser Pro Ile
20 25 30

Glu Thr Val Pro Val Lys Leu Lys Pro Gly Met Asp Gly Pro Lys Val
35 40 45

Lys Gln Trp Pro Leu Thr Glu Glu Lys Ile Lys Ala Leu Val Glu Ile
50 55 60

Cys Thr Glu Met Glu Lys Glu Gly Lys Ile Ser Lys Ile Gly Pro Glu
65 70 75 80

Asn Pro Tyr Asn Thr Pro Val Phe Ala Ile Lys Lys Lys Asp Ser Thr
85 90 95

Lys Trp Arg Lys Leu Val Asp Phe Arg Glu Leu Asn Lys Arg Thr Gln
100 105 110

Asp Phe Trp Glu Val Gln Leu Gly Ile Pro His Pro Ala Gly Leu Lys
115 120 125

Lys Lys Lys Ser Val Thr Val Leu Ala Val Gly Asp Ala Tyr Phe Ser
130 135 140

Val Pro Leu Asp Glu Asp Phe Arg Lys Tyr Thr Ala Phe Thr Ile Pro
145 150 155 160

Ser Ile Asn Asn Glu Thr Pro Gly Ile Arg Tyr Gln Tyr Asn Val Leu
165 170 175

Pro Gln Gly Trp Lys Gly Ser Pro Ala Ile Phe Gln Ser Ser Met Thr
180 185 190

Lys Ile Leu Glu Pro Phe Arg Lys Gln Asn Pro Asp Ile Val Ile Tyr
195 200 205

Gln Tyr Met Ala Ala Leu Tyr Val Gly Ser Asp Leu Glu Ile Gly Gln
210 215 220

His Arg Thr Lys Ile Glu Glu Leu Arg Gln His Leu Leu Arg Trp Gly
225 230 235 240

Leu Thr Thr Pro Asp Lys Lys His Gln Lys Glu Pro Pro Phe Leu Trp
245 250 255

Met Gly Tyr Glu Leu His Pro Asp Lys Trp Thr Val Gln Pro Ile Val
260 265 270

Leu Pro Glu Lys Asp Ser Trp Thr Val Asn Asp Ile Gln Lys Leu Val
275 280 285

Gly Lys Leu Asn Trp Ala Ser Gln Ile Tyr Pro Gly Ile Lys Val Arg
290 295 300

Gln Leu Cys Lys Leu Leu Arg Gly Thr Lys Ala Leu Thr Glu Val Ile
305 310 315 320

Pro Leu Thr Glu Glu Ala Glu Leu Glu Leu Ala Glu Asn Arg Glu Ile
325 330 335

Leu Lys Glu Pro Val His Gly Val Tyr Tyr Asp Pro Ser Lys Asp Leu
340 345 350

Ile Ala Glu Ile Gln Lys Gln Gly Gln Gly Gln Trp Thr Tyr Gln Ile
355 360 365

Tyr Gln Glu Pro Phe Lys Asn Leu Lys Thr Gly Lys Tyr Ala Arg Met
370 375 380

Arg Gly Ala His Thr Asn Asp Val Lys Gln Leu Thr Glu Ala Val Gln
385 390 395 400

Lys Ile Thr Thr Glu Ser Ile Val Ile Trp Gly Lys Thr Pro Lys Phe
405 410 415

Lys Leu Pro Ile Gln Lys Glu Thr Trp Glu Thr Trp Trp Thr Glu Tyr
420 425 430

Trp Gln Ala Thr Trp Ile Pro Glu Trp Glu Phe Val Asn Thr Pro Pro
435 440 445






WO 01/45748 PCT/US00/34724 



Leu Val Lys Leu Trp Tyr Gln Leu Glu Lys Glu Pro Ile Val Gly Ala
450 455 460

Glu Thr Phe Tyr Val Ala Gly Ala Ala Asn Arg Glu Thr Lys Leu Gly
465 470 475 480

[illegible] Gly Tyr Val Thr Asn Arg Gly Arg Gln Lys Val Val Thr Leu
485 490 495

[illegible] Thr Thr Asn Gln Lys Thr Ala Leu Gln Ala Ile Tyr Leu Ala
500 505 510

Leu Gln Asp Ser Gly Leu Glu Val Asn Ile Val Thr Ala Ser Gln Tyr
515 520 525

Ala Leu Gly Ile Ile Gln Ala Gln Pro Asp Gln Ser Glu Ser Glu Leu
530 535 540

Val Asn Gln Ile Ile Glu Gln Leu Ile Lys Lys Glu Lys Val Tyr Leu
545 550 555 560

Ala Trp Val Pro Ala His Lys Gly Ile Gly Gly Asn Glu Gln Val Asp
565 570 575

Lys Leu Val Ser Ala Gly Ile Arg Lys Val Leu Phe Leu Asp Gly Ile
580 585 590

Asp Lys Ala Gln Asp Glu His Glu Lys Tyr His Ser Asn Trp Arg Ala
595 600 605

[illegible] Ser Asp Phe Asn Leu Pro Pro Val Val Ala Lys Glu Ile Val
610 615 620

Ala Ser Cys Asp Lys Cys Gln Leu Lys Gly Glu Ala Met His Gly Gln
625 630 635 640

Val Asp Cys Ser Pro Gly Ile Trp Gln Leu Ala Cys Thr His Leu Glu
645 650 655

Gly Lys Val Ile Leu Val Ala Val His Val Ala Ser Gly Tyr Ile Glu
660 665 670

Ala Glu Val Ile Pro Ala Glu Thr Gly Gln Glu Thr Ala Tyr Phe Leu
675 680 685

Leu Lys Leu Ala Gly Arg Trp Pro Val Lys Thr Ile His Thr Ala Asn
690 695 700

Gly Ser Asn Phe Thr Gly Ala Thr Val Arg Ala Ala Cys Trp Trp Ala
705 710 715 720

Gly Ile Lys Gln Glu Phe Gly Ile Pro Tyr Asn Pro Gln Ser Gln Gly
725 730 735

Val Val Ala Ser Met Asn Lys Glu Leu Lys Lys Ile Ile Gly Gln Val
740 745 750

Arg Asp Gln Ala Glu His Leu Lys Thr Ala Val Gln Met Ala Val Phe
755 760 765

Ile His Asn Phe Lys Arg Lys Gly Gly Ile Gly Gly Tyr Ser Ala Gly
770 775 780

Glu Arg Ile Val Asp Ile Ile Ala Thr Asp Ile Gln Thr Lys Glu Leu
785 790 795 800

Gln Lys Gln Ile Thr Lys Ile Gln Asn Phe Arg Val Tyr Tyr Arg Asp
805 810 815

Ser Arg Asn Pro Leu Trp Lys Gly Pro Ala Lys Leu Leu Trp Lys Gly
820 825 830

Glu Gly Ala Val Val Ile Gln Asp Asn Ser Asp Ile Lys Val Val Pro
835 840 845

Arg Arg Lys Ala Lys Ile Ile Arg Asp Tyr Gly Lys Gln Met Ala Gly
850 855 860

Asp Asp Cys Val Ala Ser Arg Gln Asp Glu Asp
865 870 875









<210> 9 
<211> 4945 
<212> DNA 

<213> E. coli (V1Jns-tpa)
<400> 9 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240
ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300
tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360
ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420
cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480
catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540
tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600
tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660
ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720
catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780
cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840
ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900
agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960
tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020
tccccgtgcc aagagtgacg taagtaccgc ctatagactc tataggcaca cccctttggc 1080
tcttatgcat gctatactgt ttttggcttg gggcctatac acccccgctt ccttatgcta 1140
taggtgatgg tatagcttag cctataggtg tgggttattg accattattg accactcccc 1200
tattggtgac gatactttcc attactaatc cataacatgg ctctttgcca caactatctc 1260
tattggctat atgccaatac tctgtccttc agagactgac acggactctg tatttttaca 1320
ggatggggtc ccatttatta tttacaaatt cacatataca acaacgccgt cccccgtgcc 1380
cgcagttttt attaaacata gcgtgggatc tccacgcgaa tctcgggtac gtgttccgga 1440
catgggctct tctccggtag cggcggagct tccacatccg agccctggtc ccatgcctcc 1500
agcggctcat ggtcgctcgg cagctccttg ctcctaacag tggaggccag acttaggcac 1560
agcacaatgc ccaccaccac cagtgtgccg cacaaggccg tggcggtagg gtatgtgtct 1620
gaaaatgagc gtggagattg ggctcgcacg gctgacgcag atggaagact taaggcagcg 1680
gcagaagaag atgcaggcag ctgagttgtt gtattctgat aagagtcaga ggtaactccc 1740
gttgcggtgc tgttaacggt ggagggcagt gtagtctgag cagtactcgt tgctgccgcg 1800
cgcgccacca gacataatag ctgacagact aacagactgt tcctttccat gggtcttttc 1860
tgcagtcacc gtccttagat caccatggat gcaatgaaga gagggctctg ctgtgtgctg 1920
ctgctgtgtg gagcagtctt cgtttcgccc agcgagatct gctgtgcctt ctagttgcca 1980
gccatctgtt gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg ccactcccac 2040
tgtcctttcc taataaaatg aggaaattgc atcgcattgt ctgagtaggt gtcattctat 2100
tctggggggt ggggtggggc aggacagcaa gggggaggat tgggaagaca atagcaggca 2160
tgctggggat gcggtgggct ctatggccgc tgcggccagg tgctgaagaa ttgacccggt 2220
tcctcctggg ccagaaagaa gcaggcacat ccccttctct gtgacacacc ctgtccacgc 2280
ccctggttct tagttccagc cccactcata ggacactcat agctcaggag ggctccgcct 2340
tcaatcccac ccgctaaagt acttggagcg gtctctccct ccctcatcag cccaccaaac 2400
caaacctagc ctccaagagt gggaagaaat taaagcaaga taggctatta agtgcagagg 2460
gagagaaaat gcctccaaca tgtgaggaag taatgagaga aatcatagaa tttcttccgc 2520
ttcctcgctc actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca 2580
ctcaaaggcg gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg 2640
agcaaaaggc cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca 2700
taggctccgc ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa 2760
cccgacagga ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc 2820
tgttccgacc ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc 2880
gctttctcat agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct 2940
gggctgtgtg cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg 3000
tcttgagtcc aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag 3060
gattagcaga gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta 3120
cggctacact agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg 3180
aaaaagagtt ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt 3240
tgtttgcaag cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt 3300
ttctacgggg tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag 3360
attatcaaaa aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat 3420
ctaaagtata tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc 3480
tatctcagcg atctgtctat ttcgttcatc catagttgcc tgactcgggg ggggggggcg 3540
ctgaggtctg cctcgtgaag aaggtgttgc tgactcatac caggcctgaa tcgccccatc 3600
atccagccag aaagtgaggg agccacggtt gatgagagct ttgttgtagg tggaccagtt 3660
ggtgattttg aacttttgct ttgccacgga acggtctgcg ttgtcgggaa gatgcgtgat 3720
ctgatccttc aactcagcaa aagttcgatt tattcaacaa agccgccgtc ccgtcaagtc 3780
agcgtaatgc tctgccagtg ttacaaccaa ttaaccaatt ctgattagaa aaactcatcg 3840
agcatcaaat gaaactgcaa tttattcata tcaggattat caataccata tttttgaaaa 3900
agccgtttct gtaatgaagg agaaaactca ccgaggcagt tccataggat ggcaagatcc 3960
tggtatcggt ctgcgattcc gactcgtcca acatcaatac aacctattaa tttcccctcg 4020
tcaaaaataa ggttatcaag tgagaaatca ccatgagtga cgactgaatc cggtgagaat 4080
ggcaaaagct tatgcatttc tttccagact tgttcaacag gccagccatt acgctcgtca 4140
tcaaaatcac tcgcatcaac caaaccgtta ttcattcgtg attgcgcctg agcgagacga 4200
aatacgcgat cgctgttaaa aggacaatta caaacaggaa tcgaatgcaa ccggcgcagg 4260
aacactgcca gcgcatcaac aatattttca cctgaatcag gatattcttc taatacctgg 4320
aatgctgttt tcccggggat cgcagtggtg agtaaccatg catcatcagg agtacggata 4380
aaatgcttga tggtcggaag aggcataaat tccgtcagcc agtttagtct gaccatctca 4440
tctgtaacat cattggcaac gctacctttg ccatgtttca gaaacaactc tggcgcatcg 4500
ggcttcccat acaatcgata gattgtcgca cctgattgcc cgacattatc gcgagcccat 4560
ttatacccat ataaatcagc atccatgttg gaatttaatc gcggcctcga gcaagacgtt 4620
tcccgttgaa tatggctcat aacacccctt gtattactgt ttatgtaagc agacagtttt 4680
attgttcatg atgatatatt tttatcttgt gcaatgtaac atcagagatt ttgagacaca 4740
acgtggcttt cccccccccc ccattattga agcatttatc agggttattg tctcatgagc 4800
ggatacatat ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc 4860
cgaaaagtgc cacctgacgt ctaagaaacc attattatca tgacattaac ctataaaaat 4920
aggcgtatca cgaggccctt tcgtc 4945



<210> 10
<211> 23
<212> DNA
<213> Artificial Sequence
<220>
<223> oligonucleotide
<400> 10
ctatataagc agagctcgtt tag 23

<210> 11
<211> 30
<212> DNA
<213> Artificial Sequence
<220>
<223> oligonucleotide
<400> 11
gtagcaaaga tctaaggacg gtgactgcag 30

<210> 12
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> oligonucleotide
<400> 12
gtatgtgtct gaaaatgagc gtggagattg ggctcgcac 39

<210> 13
<211> 39
<212> DNA
<213> Artificial Sequence
<220>
<223> oligonucleotide
<400> 13
gtgcgagccc aatctccacg ctcattttca gacacatac 39

<210> 14
<211> 4432
<212> DNA
<213> E. coli (V1J plasmid)
<400> 14
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240
ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300
tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360
ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420
cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480
catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540
tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600
tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660
ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720
catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780
cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840
ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900
agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960
tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020
tccccgtgcc aagagtgacg taagtaccgc ctatagagtc tataggccca cccccttggc 1080
ttcttatgca tgctatactg tttttggctt ggggtctata cacccccgct tcctcatgtt 1140
ataggtgatg gtatagctta gcctataggt gtgggttatt gaccattatt gaccactccc 1200
ctattggtga cgatactttc cattactaat ccataacatg gctctttgcc acaactctct 1260
ttattggcta tatgccaata cactgtcctt cagagactga cacggactct gtatttttac 1320
aggatggggt ctcatttatt atttacaaat tcacatatac aacaccaccg tccccagtgc 1380
ccgcagtttt tattaaacat aacgtgggat ctccacgcga atctcgggta cgtgttccgg 1440
acatgggctc ttctccggta gcggcggagc ttctacatcc gagccctgct cccatgcctc 1500
cagcgactca tggtcgctcg gcagctcctt gctcctaaca gtggaggcca gacttaggca 1560
cagcacgatg cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag ggtatgtgtc 1620
tgaaaatgag ctcggggagc gggcttgcac cgctgacgca tttggaagac ttaaggcagc 1680
ggcagaagaa gatgcaggca gctgagttgt tgtgttctga taagagtcag aggtaactcc 1740
cgttgcggtg ctgttaacgg tggagggcag tgtagtctga gcagtactcg ttgctgccgc 1800
gcgcgccacc agacataata gctgacagac taacagactg ttcctttcca tgggtctttt 1860
ctgcagtcac cgtccttaga tctgctgtgc cttctagttg ccagccatct gttgtttgcc 1920
cctcccccgt gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa 1980
atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg 2040
ggcagcacag caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg 2100
gctctatggg tacccaggtg ctgaagaatt gacccggttc ctcctgggcc agaaagaagc 2160
aggcacatcc ccttctctgt gacacaccct gtccacgccc ctggttctta gttccagccc 2220
cactcatagg acactcatag ctcaggaggg ctccgccttc aatcccaccc gctaaagtac 2280
ttggagcggt ctctccctcc ctcatcagcc caccaaacca aacctagcct ccaagagtgg 2340
gaagaaatta aagcaagata ggctattaag tgcagaggga gagaaaatgc ctccaacatg 2400
tgaggaagta atgagagaaa tcatagaatt tcttccgctt cctcgctcac tgactcgctg 2460
cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 2520
tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 2580
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2640
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2700
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2760
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt 2820
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2880
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2940
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 3000
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 3060
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 3120
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 3180
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 3240

tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 3300 

tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 3360 

tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 3420 

cgttcatcca tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta 3480 

ccatctggcc ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta 3540 

tcagcaataa accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc 3600 
gcctccatcc agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat 3660

agtttgcgca acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt 3720

atggcttcat tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg 3780 

tgcaaaaaag cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca 3840 

gtgttatcac tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta 3900 

agatgctttt ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg 3960 

cgaccgagtt gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact 4020 

ttaaaagtgc tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg 4080 

ctgttgagat ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt 4140 

actttcacca gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga 4200 

ataagggcga cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc 4260 

atttatcagg gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa 4320 

caaatagggg ttccgcgcac atttccccga aaagtgccac ctgacgtcta agaaaccatt 4380 

attatcatga cattaaccta taaaaatagg cgtatcacga ggccctttcg tc 4432 

<210> 15 
<211> 4864 
<212> DNA 

<213> E. coli (V1Jneo plasmid)
<400> 15 

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 

ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300 

tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360 

ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420 

cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480 

catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540 

tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600 

tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660 

ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720 

catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780 

cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840 

ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900 

agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960 

tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020

tccccgtgcc aagagtgacg taagtaccgc ctatagagtc tataggccca cccccttggc 1080 

ttcttatgca tgctatactg tttttggctt ggggtctata cacccccgct tcctcatgtt 1140 

ataggtgatg gtatagctta gcctataggt gtgggttatt gaccattatt gaccactccc 1200 

ctattggtga cgatactttc cattactaat ccataacatg gctctttgcc acaactctct 1260

ttattggcta tatgccaata cactgtcctt cagagactga cacggactct gtatttttac 1320 

aggatggggt ctcatttatt atttacaaat tcacatatac aacaccaccg tccccagtgc 1380 

ccgcagtttt tattaaacat aacgtgggat ctccacgcga atctcgggta cgtgttccgg 1440 

acatgggctc ttctccggta gcggcggagc ttctacatcc gagccctgct cccatgcctc 1500 

cagcgactca tggtcgctcg gcagctcctt gctcctaaca gtggaggcca gacttaggca 1560 

cagcacgatg cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag ggtatgtgtc 1620 

tgaaaatgag ctcggggagc gggcttgcac cgctgacgca tttggaagac ttaaggcagc 1680 

ggcagaagaa gatgcaggca gctgagttgt tgtgttctga taagagtcag aggtaactcc 1740 

cgttgcggtg ctgttaacgg tggagggcag tgtagtctga gcagtactcg ttgctgccgc 1800 

gcgcgccacc agacataata gctgacagac taacagactg ttcctttcca tgggtctttt 1860 

ctgcagtcac cgtccttaga tctgctgtgc cttctagttg ccagccatct gttgtttgcc 1920

cctcccccgt gccttccttg accctggaag gtgccactcc cactgtcctt tcctaataaa 1980 

atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc tattctgggg ggtggggtgg 2040 
ggcagcacag caagggggag gattgggaag acaatagcag gcatgctggg gatgcggtgg 2100
gctctatggg tacccaggtg ctgaagaatt gacccggttc ctcctgggcc agaaagaagc 2160
aggcacatcc ccttctctgt gacacaccct gtccacgccc ctggttctta gttccagccc 2220
cactcatagg acactcatag ctcaggaggg ctccgccttc aatcccaccc gctaaagtac 2280
ttggagcggt ctctccctcc ctcatcagcc caccaaacca aacctagcct ccaagagtgg 2340
gaagaaatta aagcaagata ggctattaag tgcagaggga gagaaaatgc ctccaacatg 2400
tgaggaagta atgagagaaa tcatagaatt tcttccgctt cctcgctcac tgactcgctg 2460
cgctcggtcg ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta 2520
tccacagaat caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc 2580
aggaaccgta aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag 2640
catcacaaaa atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac 2700
caggcgtttc cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc 2760
ggatacctgt ccgcctttct cccttcggga agcgtggcgc tttctcaatg ctcacgctgt 2820
aggtatctca gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc 2880
gttcagcccg accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga 2940
cacgacttat cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta 3000
ggcggtgcta cagagttctt gaagtggtgg cctaactacg gctacactag aaggacagta 3060
tttggtatct gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga 3120
tccggcaaac aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg 3180
cgcagaaaaa aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag 3240
tggaacgaaa actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc 3300
tagatccttt taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact 3360
tggtctgaca gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt 3420
cgttcatcca tagttgcctg actccggggg gggggggcgc tgaggtctgc ctcgtgaaga 3480
aggtgttgct gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga 3540
gccacggttg atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt 3600
tgccacggaa cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa 3660
agttcgattt attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt 3720
tacaaccaat taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat 3780
ttattcatat caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga 3840
gaaaactcac cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg 3900
actcgtccaa catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt 3960
gagaaatcac catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct 4020
ttccagactt gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc 4080
aaaccgttat tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa 4140
ggacaattac aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca 4200
atattttcac ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc 4260
gcagtggtga gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga 4320
ggcataaatt ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg 4380
ctacctttgc catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag 4440
attgtcgcac ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca 4500
tccatgttgg aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata 4560
acaccccttg tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt 4620
ttatcttgtg caatgtaaca tcagagattt tgagacacaa cgtggctttc cccccccccc 4680
cattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 4740
tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc 4800
taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt 4860
cgtc 4864

<210> 16
<211> 4867
<212> DNA
<213> E. coli (V1Jns plasmid)
<400> 16
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240
ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300
tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360
ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420
cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480
catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540 
tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600 
tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660 
ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720 
catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780 
cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840 
ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 900 
agctcgttta gtgaaccgtc agatcgcctg gagacgccat ccacgctgtt ttgacctcca 960 

tagaagacac cgggaccgat ccagcctccg cggccgggaa cggtgcattg gaacgcggat 1020 

tccccgtgcc aagagtgacg taagtaccgc ctatagactc tataggcaca cccctttggc 1080 

tcttatgcat gctatactgt ttttggcttg gggcctatac acccccgctt ccttatgcta 1140 

taggtgatgg tatagcttag cctataggtg tgggttattg accattattg accactcccc 1200

tattggtgac gatactttcc attactaatc cataacatgg ctctttgcca caactatctc 1260

tattggctat atgccaatac tctgtccttc agagactgac acggactctg tatttttaca 1320 

ggatggggtc ccatttatta tttacaaatt cacatataca acaacgccgt cccccgtgcc 1380 

cgcagttttt attaaacata gcgtgggatc tccacgcgaa tctcgggtac gtgttccgga 1440 

catgggctct tctccggtag cggcggagct tccacatccg agccctggtc ccatgcctcc 1500 

agcggctcat ggtcgctcgg cagctccttg ctcctaacag tggaggccag acttaggcac 1560 

agcacaatgc ccaccaccac cagtgtgccg cacaaggccg tggcggtagg gtatgtgtct 1620 

gaaaatgagc gtggagattg ggctcgcacg gctgacgcag atggaagact taaggcagcg 1680 

gcagaagaag atgcaggcag ctgagttgtt gtattctgat aagagtcaga ggtaactccc 1740 

gttgcggtgc tgttaacggt ggagggcagt gtagtctgag cagtactcgt tgctgccgcg 1800 

cgcgccacca gacataatag ctgacagact aacagactgt tcctttccat gggtcttttc 1860 

tgcagtcacc gtccttagat ctgctgtgcc ttctagttgc cagccatctg ttgtttgccc 1920 

ctcccccgtg ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa 1980 

tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg 2040 

gcaggacagc aagggggagg attgggaaga caatagcagg catgctgggg atgcggtggg 2100 

ctctatggcc gctgcggcca ggtgctgaag aattgacccg gttcctcctg ggccagaaag 2160 

aagcaggcac atccccttct ctgtgacaca ccctgtccac gcccctggtt cttagttcca 2220 

gccccactca taggacactc atagctcagg agggctccgc cttcaatccc acccgctaaa 2280 

gtacttggag cggtctctcc ctccctcatc agcccaccaa accaaaccta gcctccaaga 2340

gtgggaagaa attaaagcaa gataggctat taagtgcaga gggagagaaa atgcctccaa 2400 

catgtgagga agtaatgaga gaaatcatag aatttcttcc gcttcctcgc tcactgactc 2460 

gctgcgctcg gtcgttcggc tgcggcgagc ggtatcagct cactcaaagg cggtaatacg 2520

gttatccaca gaatcagggg ataacgcagg aaagaacatg tgagcaaaag gccagcaaaa 2580

ggccaggaac cgtaaaaagg ccgcgttgct ggcgtttttc cataggctcc gcccccctga 2640

cgagcatcac aaaaatcgac gctcaagtca gaggtggcga aacccgacag gactataaag 2700 

ataccaggcg tttccccctg gaagctccct cgtgcgctct cctgttccga ccctgccgct 2760 

taccggatac ctgtccgcct ttctcccttc gggaagcgtg gcgctttctc atagctcacg 2820

ctgtaggtat ctcagttcgg tgtaggtcgt tcgctccaag ctgggctgtg tgcacgaacc 2880 

ccccgttcag cccgaccgct gcgccttatc cggtaactat cgtcttgagt ccaacccggt 2940 

aagacacgac ttatcgccac tggcagcagc cactggtaac aggattagca gagcgaggta 3000

tgtaggcggt gctacagagt tcttgaagtg gtggcctaac tacggctaca ctagaagaac 3060

agtatttggt atctgcgctc tgctgaagcc agttaccttc ggaaaaagag ttggtagctc 3120 

ttgatccggc aaacaaacca ccgctggtag cggtggtttt tttgtttgca agcagcagat 3180 

tacgcgcaga aaaaaaggat ctcaagaaga tcctttgatc ttttctacgg ggtctgacgc 3240 

tcagtggaac gaaaactcac gttaagggat tttggtcatg agattatcaa aaaggatctt 3300 

cacctagatc cttttaaatt aaaaatgaag ttttaaatca atctaaagta tatatgagta 3360 

aacttggtct gacagttacc aatgcttaat cagtgaggca cctatctcag cgatctgtct 3420 

atttcgttca tccatagttg cctgactcgg gggggggggg cgctgaggtc tgcctcgtga 3480 

agaaggtgtt gctgactcat accaggcctg aatcgcccca tcatccagcc agaaagtgag 3540 

ggagccacgg ttgatgagag ctttgttgta ggtggaccag ttggtgattt tgaacttttg 3600 

ctttgccacg gaacggtctg cgttgtcggg aagatgcgtg atctgatcct tcaactcagc 3660 

aaaagttcga tttattcaac aaagccgccg tcccgtcaag tcagcgtaat gctctgccag 3720 

tgttacaacc aattaaccaa ttctgattag aaaaactcat cgagcatcaa atgaaactgc 3780 

aatttattca tatcaggatt atcaatacca tatttttgaa aaagccgttt ctgtaatgaa 3840 

ggagaaaact caccgaggca gttccatagg atggcaagat cctggtatcg gtctgcgatt 3900 

ccgactcgtc caacatcaat acaacctatt aatttcccct cgtcaaaaat aaggttatca 3960 

agtgagaaat caccatgagt gacgactgaa tccggtgaga atggcaaaag cttatgcatt 4020 

tctttccaga cttgttcaac aggccagcca ttacgctcgt catcaaaatc actcgcatca 4080 

accaaaccgt tattcattcg tgattgcgcc tgagcgagac gaaatacgcg atcgctgtta 4140 
aaaggacaat tacaaacagg aatcgaatgc aaccggcgca ggaacactgc cagcgcatca
acaatatttt cacctgaatc aggatattct tctaatacct ggaatgctgt tttcccgggg
atcgcagtgg tgagtaacca tgcatcatca ggagtacgga taaaatgctt gatggtcgga
agaggcataa attccgtcag ccagtttagt ctgaccatct catctgtaac atcattggca
acgctacctt tgccatgttt cagaaacaac tctggcgcat cgggcttccc atacaatcga
tagattgtcg cacctgattg cccgacatta tcgcgagccc atttataccc atataaatca
gcatccatgt tggaatttaa tcgcggcctc gagcaagacg tttcccgttg aatatggctc
ataacacccc ttgtattact gtttatgtaa gcagacagtt ttattgttca tgatgatata
tttttatctt gtgcaatgta acatcagaga ttttgagaca caacgtggct ttcccccccc
ccccattatt gaagcattta tcagggttat tgtctcatga gcggatacat atttgaatgt 4740
atttagaaaa ataaacaaat aggggttccg cgcacatttc cccgaaaagt gccacctgac 4800
gtctaagaaa ccattattat catgacatta acctataaaa ataggcgtat cacgaggccc 4860



tttcgtc 

<210> 17 
<211> 75 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> oligonucleotide 
<400> 17 



cgttfcgccc SKS"*' 9 ' a9aga " gCt ct ^^ct g ctgctgtgtg gagcagtctt 



<210> 18 
<211> 78 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> oligonucleotide 



<400> 18 

cttcattS Iccltggt 93 agaCtgCtCC -acagcagc agcacacagc agagccctct 



<210> 19 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> oligonucleotide 
<400> 19 

ggtacaaata ttggctattg gccattgcat acg 

<210> 20 
<211> 36
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> oligonucleotide 
<400> 20 

ccacatctcg aggaaccggg tcaattcttc agcacc 

<210> 21 
<211> 38 
4200
4260 
4320 
4380 
4440 
4500 
4560 
4620 
4680 



4867 



60 
75 



60 
78 



33 



36 
<212> DNA

<213> Artificial Sequence 
<220> 

<223> oligonucleotide 
<400> 21 

ggtacagata tcggaaagcc acgttgtgtc tcaaaatc 38 

<210> 22 
<211> 36 
<212> DNA 

<213> Artificial Sequence 


<220> 

<223> oligonucleotide 
<400> 22 

cacatggatc cgtaatgctc tgccagtgtt acaacc 36 

<210> 23 
<211> 39 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> oligonucleotide 
<400> 23 

ggtacatgat cacgtagaaa agatcaaagg atcttcttg 39 

<210> 24
<211> 35
<212> DNA
<213> Artificial Sequence
<220>
<223> oligonucleotide
<400> 24
ccacatgtcg acccgtaaaa aggccgcgtt gctgg 35

<210> 25
<211> 4864
<212> DNA
<213> E. coli (V1R plasmid)
<400> 25

tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 

cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 

ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 

accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 

ctattggcca ttgcatacgt tgtatccata tcataatatg tacatttata ttggctcatg 300 

tccaacatta ccgccatgtt gacattgatt attgactagt tattaatagt aatcaattac 360 

ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 420 

cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 480 

catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 540 

tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 600 

tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 660 

ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 720 

catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 780 

cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 840 
ctccgcccca
agctcgttta 
tagaagacac 
tccccgtgcc 
ttcttatgca 
ataggtgatg 
ctattggtga 
ttattggcta 
aggatggggt 
ccgcagtttt 
acatgggctc 
cagcgactca 
cagcacgatg 
tgaaaatgag 



ggcagaagaa 
cgttgcggtg 
gcgcgccacc 
ctgcagtcac 
cctcccccgt 
atgaggaaat 
ggcagcacag 
gctctatggg 
aggcacatcc 
cactcatagg 
ttggagcggt 
gaagaaatta 
tgaggaagta 
cgctcggtcg 
tccacagaat 
aggaaccgta 
catcacaaaa 
caggcgtttc 
ggatacctgt 
aggtatctca 
gttcagcccg 
cacgacttat 
ggcggtgcta 
tttggtatct 
tccggcaaac 
cgcagaaaaa 
tggaacgaaa 
tagatccttt 
tggtctgaca 
cgttcatcca 
aggtgttgct 
gccacggttg 
tgccacggaa 
agttcgattt 
tacaaccaat 
ttattcatat 
gaaaactcac 
actcgtccaa 
gagaaatcac 
ttccagactt 
aaaccgttat 
ggacaattac 
atattttcac 
gcagtggtga 
ggcataaatt 
ctacctttgc 
attgtcgcac 
tccatgttgg 



ttgacgcaaa 
gtgaaccgtc 
cgggaccgat 
aagagtgacg 
tgctatactg 
gtatagctta 
cgatactttc 
tatgccaata 
ctcatttatt 
tattaaacat 
ttctccggta 
tggtcgctcg 
cccaccacca 
ctcggggagc 
gatgcaggca 
ctgttaacgg 
agacataata 
cgtccttaga 
gccttccttg 
tgcatcgcat 
caagggggag 
tacccaggtg 
ccttctctgt 
acactcatag 
ctctccctcc 
aagcaagata 
atgagagaaa 
ttcggctgcg 
caggggataa 
aaaaggccgc 
atcgacgctc 
cccctggaag 
ccgcctttct 
gttcggtgta 
accgctgcgc 
cgccactggc 
cagagttctt 
gcgctctgct 
aaaccaccgc 
aaggatctca 
actcacgtta 
taaattaaaa 
gttaccaatg 
tagttgcctg 
gactcatacc 
atgagagctt 
cggtctgcgt
attcaacaaa 
taaccaattc 
caggattatc 
cgaggcagtt 
catcaataca 
catgagtgac 
gttcaacagg 
tcattcgtga 
aaacaggaat 
ctgaatcagg 
gtaaccatgc 
ccgtcagcca 
catgtttcag 
ctgattgccc 
aatttaatcg 



tgggcggtag 
agatcgcctg 
ccagcctccg 
taagtaccgc 
tttttggctt 
gcctataggt 
cattactaat 
cactgtcctfc 
atttacaaat 
aacgtgggat 
gcggcggagc 
gcagctcctt 
ccagtgtgcc 
gggcttgcac 
gctgagttgt 
tggagggcag 
gctgacagac 
tctgctgtgc 
accctggaag 
tgtctgagta 
gattgggaag 
ctgaagaatt 
gacacaccct 
ctcaggaggg 
ctcatcagcc 
ggctattaag 
tcatagaatt 
gcgagcggta 
cgcaggaaag 
gttgctggcg 
aagtcagagg 
ctccctcgtg 
cccttcggga 
ggtcgttcgc 
cttatccggt 
agcagccact 
gaagtggtgg 
gaagccagtt 
tggtagcggt 
agaagatcct 
agggattttg 
atgaagtttt 
cttaatcagt 
actccggggg 
aggcctgaat 
tgttgtaggt 
tgtcgggaag 
gccgccgtcc 
tgattagaaa 
aataccatat 
ccataggatg 
acctattaat 
gactgaatcc 
ccagccatta 
ttgcgcctga 
cgaatgcaac 
atattcttct 
atcatcagga 
gtttagtctg 
aaacaactct 
gacattatcg 
cggcctcgag 



gcgtgtacgg 
gagacgccat 
cggccgggaa 
ctatagagtc 
ggggtctata 
gtgggttatt 
ccataacatg 
cagagactga 
tcacatatac 
ctccacgcga 
ttctacatcc 
gctcctaaca 
gcacaaggcc 
cgctgacgca 
tgtgttctga 
tgtagtctga 
taacagactg 
cttctagttg 
gtgccactcc 
ggtgtcattc 
acaatagcag 
gacccggttc 
gtccacgccc 
ctccgccttc 
caccaaacca 
tgcagaggga 
tcttccgctt 
tcagctcact 
aacatgtgag 
tttttccata 
tggcgaaacc 
cgctctcctg 
agcgtggcgc 
tccaagctgg 
aactatcgtc 
ggtaacagga 
cctaactacg 
accttcggaa 
ggtttttttg 
ttgatctttt 
gtcatgagat 
aaatcaatct 



tgggaggtct 
ccacgctgtt 
cggtgcattg 
tataggccca 
cacccccgct 
gaccattatt 
gctctttgcc 
cacggactct 



aacaccaccg 
atctcgggta 
gagccctgct 
gtggaggcca 
gtggcggtag 
tttggaagac 
taagagtcag 
gcagtactcg 
ttcctttcca 
ccagccatct 
cactgtcctt 
tattctgggg 
gcatgctggg 
ctcctgggcc 
ctggttctta 
aatcccaccc 



gaggcaccta 

gggggggcgc 

cgccccatca 
ggaccagttg 
atgcgtgatc 
cgtcaagtca 
aactcatcga 
ttttgaaaaa 
gcaagatcct 
ttcccctcgt 
ggtgagaatg 
cgctcgtcat 
gcgagacgaa 
cggcgcagga 
aatacctgga 
gtacggataa 
accatctcat 



ggcgcatcgg 
cgagcccatt 
caagacgttt 



aacctagcct 
gagaaaatgc 
cctcgctcac 
caaaggcggt 
caaaaggcca 
ggctccgccc 
cgacaggact 
ttccgaccct 
tttctcaatg 
gctgtgtgca 
ttgagtccaa 
ttagcagagc 
gctacactag 
aaagagttgg 
tttgcaagca 
ctacggggtc 
tatcaaaaag 
aaagtatata 
tctcagcgat 
tgaggtctgc 
tccagccaga 
gtgattttga 
tgatccttca 
gcgtaatgct 
gcatcaaatg 
gccgtttctg 
ggtatcggtc 
caaaaataag 
gcaaaagctt 
caaaatcact 
atacgcgatc 
acactgccag 
atgctgtttt 
aatgcttgat 
ctgtaacatc 
gcttcccata 
tatacccata 
cccgttgaat 



atataagcag 900 

ttgacctcca 960 

gaacgcggat 1020 

cccccttggc 1080 

tcctcatgtt 1140 

gaccactccc 1200 

acaactctct 1260 

gtatfctttac 1320 

tccccagtgc 1380 

cgtgttccgg 1440 

cccatgcctc 1500 

gacttaggca 1560 

ggtatgtgtc 1620 

ttaaggcagc 1680 

aggtaactcc 1740 

ttgctgccgc 180 0 

tgggtctttt 1860 

gttgtttgcc 192 0 

tcctaataaa 198 0 

ggtggggtgg 2040 

gatgcggtgg 2100 

agaaagaagc 2160 

gttccagccc 2220 

gctaaagtac 2280 

ccaagagtgg 2340 

ctccaacatg 2400 

tgactcgctg 2460 

aatacggtta . 2520 

gcaaaaggcc 2580 

ccctgacgag 2640 

ataaagatac 2700 

gccgcttacc 2760 

ctcacgctgt 2820 

cgaacccccc 2880 

cccggtaaga 2940 

gaggtatgta 3000 

aaggacag t a 3060 

tagctcttga 3120 

gcagattacg 3180 

tgacgctcag 3240 

gatcttcacc 3300 

tgagtaaact 3360 

ctgtctattt 3420 

ctcgtgaaga 3480 

aagtgaggga 3540 

acttttgctt 3600 

actcagcaaa 3660 

ctgccagtgt 3720 

aaactgcaat 3780 

taatgaagga 3840 

tgcgattccg 3900 

gttatcaagt .3960 

atgcatttct 4020 

cgcatcaacc 4080 

gctgttaaaa 4140 

cgcatcaaca 4200 

cccggggatc 4260 

ggtcggaaga 4320 

attggcaacg 4380 

caatcgatag 4440 

taaatcagca 4500 

atggctcata 4560 
acaccccttg tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt 4620

ttatcttgtg caatgtaaca tcagagattt tgagacacaa cgtggctttc cccccccccc 4680

cattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 4740

tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc 4800

taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt 4860

cgtc 4864

<210> 26
<211> 36
<212> DNA
<213> Artificial Sequence

<220>
<223> oligonucleotide

<400> 26

ggtacaagat ctccgccccc atctccccca ttgaga 36

<210> 27
<211> 33
<212> DNA
<213> Artificial Sequence

<220>
<223> oligonucleotide

<400> 27

ccacatagat ctgcccgggc tttagtcctc atc 33

<210> 28
<211> 27
<212> PRT
<213> Homo sapiens

<400> 28

Met Asp Ala Met Lys Arg Gly Leu Cys Cys Val Leu Leu Leu Cys Gly
1               5                   10                  15

Ala Val Phe Val Ser Pro Ser Glu Ile Ser Ser
            20                  25

<210> 29
<211> 45
<212> DNA
<213> Artificial Sequence

<220>
<223> oligonucleotide

<400> 29

caggcgagat ctaccatggc ccccattagc cctattgaga ctgta 45

<210> 30
<211> 48
<212> DNA
<213> Artificial Sequence

<220>
<223> oligonucleotide

<400> 30

caggcgagat ctgcccgggc tttaatcctc atcctgtcta cttgccac 48
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